How do I import data into MongoDB from SQL Server?

Time: 2022-03-11 07:11:53

I have these tables in a SQL database, with the following columns:

States, Cities, CityAreas

States
Id Name

Cities
Id Name StatesId

CityAreas
Id Name CityId

and I want the data in MongoDB to look like this:


{
      State:"Orissa",
      Cities:{
                CitiName:"Phulbani",
                CitYArea:{
                              "Phulbani","Phulbani2","Pokali","Madira"
                         }
             }
}

Are there any tools for this, or do I need to write code for this data transformation?

2 solutions

#1


12  

There are several ways to approach this. One is to write code in your language of choice, using the appropriate APIs to select the data, transform it and then insert it into MongoDB.
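
As a rough illustration of the "write code" route, here is a minimal sketch of the transform step alone, in plain JavaScript. The `rows` array and the `rowsToCityDocs` helper are hypothetical; they stand in for whatever your SQL driver returns from a join of the three tables and are not part of any library.

```javascript
// Hypothetical input: flat rows as a SQL driver might return them from a
// join of the three tables (a null area means the city has no areas).
const rows = [
  { state: "California", city: "Los Angeles", area: "Skid Row" },
  { state: "California", city: "Los Angeles", area: "Bunker Hill" },
  { state: "California", city: "San Diego", area: null },
  { state: "New York", city: "Buffalo", area: null },
];

// Group the flat rows into one document per (state, city) pair.
function rowsToCityDocs(flatRows) {
  const docs = new Map();
  for (const { state, city, area } of flatRows) {
    // Key on both state and city so same-named cities in different states stay apart.
    const key = JSON.stringify([state, city]);
    if (!docs.has(key)) {
      docs.set(key, { state, city: { name: city, areas: [] } });
    }
    // Rows produced by an outer join carry no area; keep the array empty for them.
    if (area !== null && area !== "") {
      docs.get(key).city.areas.push(area);
    }
  }
  return [...docs.values()];
}

const cityDocs = rowsToCityDocs(rows);
console.log(JSON.stringify(cityDocs, null, 2));
```

The resulting array is in the per-city shape assumed in this answer, so it could then be handed to whatever bulk insert call your MongoDB driver provides.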

You can also do it using SQL, the MongoDB query language and the shell. One straightforward way is to select flat data via SQL, dump it into a CSV file, import it into MongoDB and use the aggregation framework to transform it into the format you want.

If you are lucky enough to use a database that supports arrays, or other ways of grouping rows into a single list type, then you can make a single select and turn it into JSON or a MongoDB insert statement.

For these examples, I'm going to assume you want the format equivalent to a document for each city:

{
      State:"Orissa",
      City:{
           Name:"Phulbani",
           Area:[
                 "Phulbani","Phulbani2","Pokali","Madira"
                ]
           }
}

Sample data in RDBMS:

asya=# select * from states;
 id |     name      
----+---------------
  1 | California
  2 | New York
  3 | Massachusetts
(3 rows)

asya=# select * from cities;
 id |     name      | states_id 
----+---------------+-----------
  1 | Los Angeles   |         1
  2 | San Francisco |         1
  3 | San Diego     |         1
  4 | New York      |         2
  5 | Brooklyn      |         2
  6 | Buffalo       |         2
  7 | Boston        |         3
(7 rows)

asya=# select * from cityarea;
 id |        name        | city_id 
----+--------------------+---------
  1 | Beacon Hill        |       7
  2 | Backbay            |       7
  3 | Brookline          |       7
  4 | Park Slope         |       5
  5 | Little Italy       |       4
  6 | SOHO               |       4
  7 | Harlem             |       4
  8 | West Village       |       4
  9 | SoMa               |       2
 10 | South Beach        |       2
 11 | Haight Ashbury     |       2
 12 | Cole Valley        |       2
 13 | Bunker Hill        |       1
 14 | Skid Row           |       1
 15 | Fashion District   |       1
 16 | Financial District |       1
(16 rows)

The easy way with arrays:

SELECT 'db.cities.insert({ state:"' || states.name || '", city: { name: "' || cities.name || '", areas : [ ' || array_to_string(array_agg('"' || cityarea.name || '"'),',') || ']}});'
FROM states JOIN cities ON (states.id=cities.states_id) LEFT OUTER JOIN cityarea ON (cities.id=cityarea.city_id) GROUP BY states.name, cities.name;

gives you output that can go straight into the MongoDB shell:

 db.cities.insert({ state:"California", city: { name: "Los Angeles", areas : [ "Financial District","Fashion District","Skid Row","Bunker Hill"]}});
 db.cities.insert({ state:"California", city: { name: "San Diego", areas : [ ]}});
 db.cities.insert({ state:"California", city: { name: "San Francisco", areas : [ "Haight Ashbury","South Beach","SoMa","Cole Valley"]}});
 db.cities.insert({ state:"Massachusetts", city: { name: "Boston", areas : [ "Beacon Hill","Brookline","Backbay"]}});
 db.cities.insert({ state:"New York", city: { name: "Brooklyn", areas : [ "Park Slope"]}});
 db.cities.insert({ state:"New York", city: { name: "Buffalo", areas : [ ]}});
 db.cities.insert({ state:"New York", city: { name: "New York", areas : [ "Little Italy","West Village","Harlem","SOHO"]}});
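
One caveat with building these statements by string concatenation is that a name containing a double quote or a backslash would produce a broken statement. If you generate the statements from code instead, a minimal sketch like the following lets `JSON.stringify` do the escaping. The `toInsertStatement` helper and the sample document are hypothetical, for illustration only.

```javascript
// Build one shell statement per document; JSON.stringify escapes quotes
// and backslashes inside the values.
function toInsertStatement(collection, doc) {
  return `db.${collection}.insert(${JSON.stringify(doc)});`;
}

// Hypothetical document whose city name contains double quotes.
const sampleDoc = {
  state: "New York",
  city: { name: 'The "Big" Apple', areas: ["SOHO"] },
};

const statement = toInsertStatement("cities", sampleDoc);
console.log(statement);
```

The keys come out quoted, which the shell accepts just as well as the unquoted form shown above.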

The longer way, if you don't have support for array or list types, is to select the joined data:

asya=# SELECT states.name as state, cities.name as city, cityarea.name as area 
FROM states JOIN cities ON (states.id=cities.states_id) 
LEFT OUTER JOIN cityarea ON (cities.id=cityarea.city_id);
     state     |     city      |        area        
---------------+---------------+--------------------
 California    | Los Angeles   | Financial District
 California    | Los Angeles   | Fashion District
 California    | Los Angeles   | Skid Row
 California    | Los Angeles   | Bunker Hill
 California    | San Francisco | Cole Valley
 California    | San Francisco | Haight Ashbury
 California    | San Francisco | South Beach
 California    | San Francisco | SoMa
 California    | San Diego     | 
 New York      | New York      | West Village
 New York      | New York      | Harlem
 New York      | New York      | SOHO
 New York      | New York      | Little Italy
 New York      | Brooklyn      | Park Slope
 New York      | Buffalo       | 
 Massachusetts | Boston        | Brookline
 Massachusetts | Boston        | Backbay
 Massachusetts | Boston        | Beacon Hill
(18 rows)

I used a left outer join on cityarea because my sample data includes cities without any areas listed, and I wanted to get every (state, city) pair even when no area is listed for it.

You can dump this out interactively or via the command line (use the appropriate syntax for your RDBMS). I'll do it interactively:

asya=# \a
Output format is unaligned.
asya=# \f
Field separator is "|".
asya=# \f ,
Field separator is ",".
asya=# \t
Showing only tuples.
asya=# \o dump.txt
asya=# SELECT states.name as state, cities.name as city, cityarea.name as area 
FROM states JOIN cities ON (states.id=cities.states_id) 
LEFT OUTER JOIN cityarea ON (cities.id=cityarea.city_id);
asya=# \q
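
Note that the unaligned output used here does not quote fields, so a name containing a comma would split into extra columns. If you write the file from code instead, a minimal sketch of RFC 4180 style quoting could look like this. The `csvField` and `csvLine` helpers and the "Midtown, East" area name are hypothetical, and the sketch assumes the importer honors quoted fields.

```javascript
// Quote a field only when it contains a comma, a double quote or a line break;
// embedded double quotes are doubled (RFC 4180 style).
function csvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Join already-escaped fields into one CSV line.
function csvLine(fields) {
  return fields.map(csvField).join(",");
}

// Hypothetical row whose area name contains a comma.
const line = csvLine(["New York", "New York", "Midtown, East"]);
console.log(line);
```

A null area becomes an empty trailing field, which matches what the outer join produces for cities without areas.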

I now have a comma-separated file with state, city and area as the three fields. I can load it into MongoDB via the mongoimport utility:

asya$ mongoimport -d sample -c tmpcities --type csv --fields state,city,area < dump.txt 
connected to: 127.0.0.1
2014-08-05T07:41:36.744-0700 check 9 18
2014-08-05T07:41:36.744-0700 imported 18 objects

Now, to transform it into the format I want, I use aggregation:

mongo sample
MongoDB shell version: 2.6.4
connecting to: sample1
> db.tmpcities.aggregate(
{$group:{_id:"$city", state:{$first:"$state"}, areas:{$push:"$area"}}},
{$project:{state:1,_id:0,city:{name:"$_id", areas:"$areas"}}},
{$out:'cities'})
> db.cities.find({},{_id:0})
{ "_id" : "Boston", "state" : "Massachusetts", "areas" : [ "Brookline", "Backbay", "Beacon Hill" ] }
{ "_id" : "New York", "state" : "New York", "areas" : [ "West Village", "Harlem", "SOHO", "Little Italy" ] }
{ "_id" : "Buffalo", "state" : "New York", "areas" : [ "" ] }
{ "_id" : "Brooklyn", "state" : "New York", "areas" : [ "Park Slope" ] }
{ "_id" : "San Diego", "state" : "California", "areas" : [ "" ] }
{ "_id" : "San Francisco", "state" : "California", "areas" : [ "Cole Valley", "Haight Ashbury", "South Beach", "SoMa" ] }
{ "_id" : "Los Angeles", "state" : "California", "areas" : [ "Financial District", "Fashion District", "Skid Row", "Bunker Hill" ] }
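
The listing above still shows `_id` and a top-level `areas` field, so it appears to be the output of the `$group` stage rather than of the full pipeline, and cities without areas end up with `[ "" ]`. To make the intended final shape concrete, here is a minimal plain-JavaScript sketch of what `$group` followed by `$project` computes. It is an emulation for illustration, not MongoDB itself; the `groupAndProject` helper and sample documents are hypothetical, and unlike `$push` it skips empty strings.

```javascript
// Hypothetical sample of the imported tmpcities documents; the CSV import
// stored a missing area as an empty string.
const tmpcities = [
  { state: "Massachusetts", city: "Boston", area: "Backbay" },
  { state: "Massachusetts", city: "Boston", area: "Beacon Hill" },
  { state: "New York", city: "Buffalo", area: "" },
];

// Plain-JavaScript equivalent of the $group stage followed by $project.
function groupAndProject(docs) {
  const groups = new Map();
  for (const { state, city, area } of docs) {
    // $group: the city is the key, the first state seen is kept, areas are collected.
    if (!groups.has(city)) groups.set(city, { state, areas: [] });
    // Deviation from $push: drop empty strings so "no areas" becomes [].
    if (area !== "") groups.get(city).areas.push(area);
  }
  // $project: nest name and areas under city and drop the grouping key.
  return [...groups.entries()].map(([name, { state, areas }]) => ({
    state,
    city: { name, areas },
  }));
}

const cities = groupAndProject(tmpcities);
console.log(JSON.stringify(cities));
```

Because the grouping key is the city name alone, as in the pipeline above, two cities with the same name in different states would be merged; grouping on both state and city avoids that.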

#2


3  

Try Mongify. It takes care of the foreign key and referential integrity constraints that exist in SQL while migrating the data to MongoDB.
As per its documentation:

Mongify helps you move your data without worrying about the IDs or foreign IDs. It allows you to embed data into documents, including polymorphic associations.

Hope it helps.
