MongoDB/CouchDB: Join a table to itself

I have a relational data model I'm thinking about migrating to either MongoDB or CouchDB, and I'm trying to understand how the queries would work. Suppose I have two entities, Employees and Projects, and a many-to-many join table called Assignments. I want to query for all the projects on which two users collaborated. In SQL I could do something like this:

SELECT DISTINCT a1.project_id
FROM assignments a1, assignments a2
WHERE a1.project_id = a2.project_id
AND a1.employee_id = ?
AND a2.employee_id = ?

How would I do this in NoSQL, assuming I have Employee, Project, and Assignment "documents"? Or would you structure the documents differently, and how would that affect the query?

I'd be happy to hear answers both for Mongo's query API and Couch's map/reduce approach.


I can only share the view from the Couch, where I, um, relax every once in a while. First, you almost always have to intentionally forget SQL when dealing with document-based systems.

Second, you will have the option of either changing the structure and reducing the amount of entities to the minimum that won't require joins, or you could use views that will combine documents of different types into a single result set.

The former (redesign) is the preferred method, as joins are essentially a foreign approach for nosql, since normalization is not a requirement there. Document-based is not relational.


In MongoDB you can do it like this. I use the interactive JavaScript Shell.

First create some users:

> db.so.employee.insert({name: "Joe"})
> db.so.employee.insert({name: "Moe"})
> db.so.employee.insert({name: "Bart"})
> db.so.employee.insert({name: "Homer"})
> db.so.employee.find()
{ "_id" : ObjectId("4de35ccbcc0379536e1ac43b"), "name" : "Joe" }
{ "_id" : ObjectId("4de35ccfcc0379536e1ac43c"), "name" : "Moe" }
{ "_id" : ObjectId("4de35cd3cc0379536e1ac43d"), "name" : "Bart" }
{ "_id" : ObjectId("4de35cd7cc0379536e1ac43e"), "name" : "Homer" }

Now create some projects

> db.so.project.insert({name: "Web App A"})
> db.so.project.insert({name: "Web App B"})
> db.so.project.insert({name: "Web App C"})
> db.so.project.find();
{ "_id" : ObjectId("4de35d0fcc0379536e1ac43f"), "name" : "Web App A" }
{ "_id" : ObjectId("4de35d13cc0379536e1ac440"), "name" : "Web App B" }
{ "_id" : ObjectId("4de35d15cc0379536e1ac441"), "name" : "Web App C" }

Add the users to the projects

> db.so.project.update({name: "Web App A"}, {$push: {employees: ObjectId('4de35ccbcc0379536e1ac43b') }})
> db.so.project.update({name: "Web App A"}, {$push: {employees: ObjectId('4de35ccfcc0379536e1ac43c') }})
> db.so.project.update({name: "Web App B"}, {$push: {employees: ObjectId('4de35ccfcc0379536e1ac43c') }})
> db.so.project.update({name: "Web App C"}, {$push: {employees: ObjectId('4de35ccfcc0379536e1ac43c') }})
> db.so.project.update({name: "Web App B"}, {$push: {employees: ObjectId('4de35cd3cc0379536e1ac43d') }})
> db.so.project.update({name: "Web App C"}, {$push: {employees: ObjectId('4de35cd3cc0379536e1ac43d') }})
> db.so.project.update({name: "Web App B"}, {$push: {employees: ObjectId('4de35cd7cc0379536e1ac43e') }})

> db.so.project.find()
{ "_id" : ObjectId("4de35d0fcc0379536e1ac43f"), "employees" : [
    ObjectId("4de35ccbcc0379536e1ac43b"),
    ObjectId("4de35ccfcc0379536e1ac43c")
], "name" : "Web App A" }
{ "_id" : ObjectId("4de35d15cc0379536e1ac441"), "employees" : [
    ObjectId("4de35ccfcc0379536e1ac43c"),
    ObjectId("4de35cd3cc0379536e1ac43d")
], "name" : "Web App C" }
{ "_id" : ObjectId("4de35d13cc0379536e1ac440"), "employees" : [
    ObjectId("4de35cd3cc0379536e1ac43d"),
    ObjectId("4de35cd7cc0379536e1ac43e")
], "name" : "Web App B" }

If you now want to find all projects on which works "Joe"

> db.so.project.find({employees: ObjectId('4de35ccbcc0379536e1ac43b') }, {name: 1})
{ "_id" : ObjectId("4de35d0fcc0379536e1ac43f"), "name" : "Web App A" }

Find all projects on which Joe OR Moe works

> db.so.project.find({employees: {$in: [ObjectId('4de35ccbcc0379536e1ac43b'), ObjectId('4de35ccfcc0379536e1ac43c')] }}, {name: 1})
{ "_id" : ObjectId("4de35d0fcc0379536e1ac43f"), "name" : "Web App A" }
{ "_id" : ObjectId("4de35d15cc0379536e1ac441"), "name" : "Web App C" }
{ "_id" : ObjectId("4de35d13cc0379536e1ac440"), "name" : "Web App B" }

Find all project on which Joe AND moe works

> db.so.project.find({employees: {$all: [ObjectId('4de35ccbcc0379536e1ac43b'), ObjectId('4de35ccfcc0379536e1ac43c')] }}, {name: 1})
{ "_id" : ObjectId("4de35d0fcc0379536e1ac43f"), "name" : "Web App A" }

To get all employess for a specific project you need two queries. Finding all employees from Project C.

> db.so.project.find({name: "Web App C"}, {employees: 1})
{ "_id" : ObjectId("4de35d15cc0379536e1ac441"), "employees" : [
    ObjectId("4de35ccfcc0379536e1ac43c"),
    ObjectId("4de35cd3cc0379536e1ac43d")
] }

In your Appllication you build a new query from the returning values and build this query:

> db.so.employee.find({_id: {$in: [ObjectId('4de35ccfcc0379536e1ac43c'), ObjectId('4de35cd3cc0379536e1ac43d')] }})
{ "_id" : ObjectId("4de35ccfcc0379536e1ac43c"), "name" : "Moe" }
{ "_id" : ObjectId("4de35cd3cc0379536e1ac43d"), "name" : "Bart" }

I hoped that helped to understand how MongoDB works and how you could build relations. I used manual dereferencing here. That means i save the ObjectID directly and fetch it manually. There also exists "DBRef" and your Driver then fetch it for you.


What you are looking for is effectively an N:M mapping. In this case you're trying to map a table to itself, but this is not very different from trying to map "Employees" to "Projects".

There's a long answer over here on SO that I won't repeat here.

In your specific case, I think you need to re-design your data a little. You have an assignments table containing two data points: projectID and employeedID . This is a classic table for joining N projects to M employees.

In MongoDB, you generally don't have this "table" at all. If you want to assign employees to a project, you can just store an array of employeeID in the Project itself.

{ name: 'projectx', emps: [ 1, 2, 3] }
{ name: 'projecty', emps: [ 3, 2] }

It looks like your query is basically "find all projects where employee 2 and 3 are working together". MongoDB has an $all query operator that will do this for you with the above structure.

Please note that this is just one query. It's important to look at your whole system and identify all important queries when doing schema design.

链接地址: http://www.djcxy.com/p/15160.html

上一篇: MongoDB / CouchDB用于存储文件+复制?

下一篇: MongoDB / CouchDB:将表加入自己