Migrating from MySQL to PostgreSQL on Linux (Kubuntu)

Date: 2022-06-30 20:14:55

A long time ago on a system far, far away...

Trying to migrate a database from MySQL to PostgreSQL. All the documentation I have read covers, in great detail, how to migrate the structure. I have found very little documentation on migrating the data. The schema has 13 tables (which have been migrated successfully) and 9 GB of data.

MySQL version: 5.1.x
PostgreSQL version: 8.4.x

I want to use the R programming language to analyze the data using SQL select statements; PostgreSQL has PL/R, but MySQL has nothing (as far as I can tell).

A New Hope

Create the database location (/var has insufficient space; also dislike having the PostgreSQL version number everywhere -- upgrading would break scripts!):

  1. sudo mkdir -p /home/postgres/main
  2. sudo cp -Rp /var/lib/postgresql/8.4/main /home/postgres
  3. sudo chown -R postgres:postgres /home/postgres
  4. sudo chmod -R 700 /home/postgres
  5. sudo usermod -d /home/postgres/ postgres

All good to here. Next, restart the server and configure the database using these installation instructions:

  1. sudo apt-get install postgresql pgadmin3
  2. sudo /etc/init.d/postgresql-8.4 stop
  3. sudo vi /etc/postgresql/8.4/main/postgresql.conf
  4. Change data_directory to /home/postgres/main
  5. sudo /etc/init.d/postgresql-8.4 start
  6. sudo -u postgres psql postgres
  7. \password postgres
  8. sudo -u postgres createdb climate
  9. pgadmin3

Use pgadmin3 to configure the database and create a schema.

The episode continues in a remote shell known as bash, with both databases running, and the installation of a set of tools with a rather unusual logo: SQL Fairy.

  1. perl Makefile.PL
  2. sudo make install
  3. sudo apt-get install perl-doc (strangely, it is not called perldoc)
  4. perldoc SQL::Translator::Manual

Extract a PostgreSQL-friendly DDL and all the MySQL data:

  1. sqlt -f DBI --dsn dbi:mysql:climate --db-user user --db-password password -t PostgreSQL > climate-pg-ddl.sql
  2. Edit climate-pg-ddl.sql and convert the identifiers to lowercase, and insert the schema reference (using VIM):
    • :%s/"\([A-Z_]*\)"/\L\1/g
    • :%s/ TABLE / TABLE climate./g
    • :%s/ on / on climate./g
  3. mysqldump --skip-add-locks --complete-insert --no-create-db --no-create-info --quick --result-file="climate-my.sql" --databases climate --skip-comments -u root -p
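
For anyone who would rather script the VIM edits than type them, here is a minimal Python sketch of the same three substitutions. The function name adapt_ddl and its schema parameter are made up for illustration, and the sample DDL in the usage note is invented, not taken from the real climate schema.

```python
import re


def adapt_ddl(ddl: str, schema: str = "climate") -> str:
    """Lowercase quoted upper-case identifiers and add a schema prefix."""
    # "STATION_ID" -> station_id, like :%s/"\([A-Z_]*\)"/\L\1/g
    ddl = re.sub(r'"([A-Z_]*)"', lambda m: m.group(1).lower(), ddl)
    # " TABLE name" -> " TABLE climate.name"
    ddl = ddl.replace(" TABLE ", f" TABLE {schema}.")
    # " on name" -> " on climate.name" (index definitions)
    ddl = ddl.replace(" on ", f" on {schema}.")
    return ddl
```

Calling adapt_ddl('CREATE INDEX "IDX" on "STATION" ("STATION_ID");') should give CREATE INDEX idx on climate.station (station_id);. Like the VIM commands, this is a blind text substitution, so it will also rewrite any " on " or " TABLE " that happens to appear inside a comment or default value.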

It might be worthwhile to simply rename the tables and columns in MySQL to lowercase:

  1. select concat( 'RENAME TABLE climate.', TABLE_NAME, ' to climate.', lower(TABLE_NAME), ';' ) from INFORMATION_SCHEMA.TABLES where TABLE_SCHEMA='climate';
  2. Execute the commands from the previous step.
  3. There is probably a way to do the same for columns; I changed them manually because it was faster than figuring out how to write the query.

The Database Strikes Back

Recreate the structure in PostgreSQL as follows:

  1. pgadmin3 (switch to it)
  2. Click the Execute arbitrary SQL queries icon
  3. Open climate-pg-ddl.sql
  4. Search for TABLE " replace with TABLE climate." (insert the schema name climate)
  5. Search for on " replace with on climate." (insert the schema name climate)
  6. Press F5 to execute

This results in:

Query returned successfully with no result in 122 ms.

Replies of the Jedi

At this point I am stumped.

  • Where do I go from here (what are the steps) to convert climate-my.sql to climate-pg.sql so that it can be executed against PostgreSQL?
  • How do I make sure the indexes are copied over correctly (to maintain referential integrity; I don't have constraints at the moment, to ease the transition)?
  • How do I ensure that adding new rows in PostgreSQL will start enumerating from the index of the last row inserted (and not conflict with an existing primary key from the sequence)?
  • How do you ensure the schema name comes through when transforming the data from MySQL to PostgreSQL inserts?

Thank you!

4 Solutions

#1


4  

What I usually do for such migrations is two-fold:

  • Extract the whole database definition from MySQL and adapt it to PostgreSQL syntax.
  • Go over the database definition and transform it to take advantage of functionality in PostgreSQL that doesn't exist in MySQL.

Then do the conversion, and write a program in whatever language you are most comfortable with that accomplishes the following:

  • Reads the data from the MySQL database.
  • Performs whatever transformation is necessary on the data to be stored in the PostgreSQL database.
  • Saves the now-transformed data in the PostgreSQL database.
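
As a rough illustration of that read / transform / save loop, here is a sketch written against the generic Python DB-API. Everything in it is an assumption for illustration: the copy_table helper, the station table and the whitespace-trimming transform are invented, and sqlite3 only stands in for both ends so the example runs without a server. With real drivers you would pass a MySQL connection as the source and a PostgreSQL connection as the target, and set placeholder="%s" for drivers such as psycopg2 that use that parameter style.

```python
import sqlite3  # stand-in for the real MySQL/PostgreSQL drivers


def copy_table(src_conn, dst_conn, table, columns,
               transform=lambda row: row, placeholder="?", batch_size=1000):
    """Stream rows from src to dst in batches, transforming each row."""
    # Table and column names are interpolated directly: trusted input only.
    col_list = ", ".join(columns)
    marks = ", ".join([placeholder] * len(columns))
    insert_sql = f"INSERT INTO {table} ({col_list}) VALUES ({marks})"
    src_cur = src_conn.cursor()
    dst_cur = dst_conn.cursor()
    src_cur.execute(f"SELECT {col_list} FROM {table}")
    copied = 0
    while True:
        rows = src_cur.fetchmany(batch_size)  # read one batch
        if not rows:
            break
        dst_cur.executemany(insert_sql, [transform(r) for r in rows])
        copied += len(rows)
    dst_conn.commit()  # save everything in one transaction
    return copied


# Demo with two in-memory databases and a made-up table.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE station (id INTEGER, name TEXT)")
src.executemany("INSERT INTO station VALUES (?, ?)",
                [(1, " Alpha "), (2, "Beta")])
dst.execute("CREATE TABLE station (id INTEGER, name TEXT)")
n = copy_table(src, dst, "station", ["id", "name"],
               transform=lambda r: (r[0], r[1].strip()))
```

Fetching in batches keeps memory bounded on the Python side, but with 9 GB of data the MySQL driver's default cursor may still buffer the whole result set, so a server-side (unbuffered) cursor is probably worth using for the source.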

Redesign the tables for PostgreSQL to take advantage of its features.

If you just do something like use a sed script to convert the SQL dump from one format to the next, all you are doing is putting a MySQL database in a PostgreSQL server. You can do that, and there will still be some benefit from doing so, but if you're going to migrate, migrate fully.

It will involve a little bit more up-front time spent, but I have yet to come across a situation where it isn't worth it.

#2


2  

Convert the mysqldump file to a PostgreSQL-friendly format

Convert the data as follows (do not use mysql2pgsql.perl):

  1. Escape the quotes.

    sed "s/\\\'/\'\'/g" climate-my.sql | sed "s/\\\r/\r/g" | sed "s/\\\n/\n/g" > escaped-my.sql

  2. Replace the USE "climate"; statement with a search path and turn the /* comment lines into -- comments:

    sed "s/USE \"climate\";/SET search_path TO climate;/g" escaped-my.sql | sed "s/^\/\*/--/" > climate-pg.sql

  3. Connect to the database.

    sudo su - postgres
    psql climate

  4. Set the encoding (mysqldump ignores its encoding parameter) and then execute the script.

    \encoding iso-8859-1
    \i climate-pg.sql

This series of steps will probably not work for complex databases with many mixed types. However, it works for integers, varchars, and floats.
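
The quote-escaping step can also be sketched in Python. The function name convert_escapes is hypothetical. Unlike the three chained sed calls, this version scans the text once, so an escaped backslash followed by a quote is left alone instead of being mangled; otherwise it is meant to do the same job and has the same limits for complex data.

```python
import re

# MySQL backslash escapes that the sed pipeline rewrites
_REPLACEMENTS = {"'": "''", "r": "\r", "n": "\n"}


def convert_escapes(text: str) -> str:
    """Turn backslash-quote into two quotes and backslash-r/n into CR/LF."""
    def swap(match):
        ch = match.group(1)
        # Any other escape (including a doubled backslash) is kept as-is.
        return _REPLACEMENTS.get(ch, match.group(0))

    # Consume one escape pair at a time, left to right.
    return re.sub(r"\\(.)", swap, text, flags=re.DOTALL)
```

For a 9 GB dump you would want to apply this line by line while streaming the file rather than reading it all into memory.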

Indexes, primary keys, and sequences

Since mysqldump included the primary keys when generating the INSERT statements, they will trump the table's automatic sequence. The sequences for all tables remained 1 upon inspection.

Set the sequence after import

Using the ALTER SEQUENCE command will set them to whatever value is needed.
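
A small sketch of generating those ALTER SEQUENCE statements, assuming you have already looked up the highest key in each table (for example with SELECT MAX(id)). The helper name, the sequence name climate.station_id_seq and the value 1000 are all placeholders; check the real sequence names in your schema before running anything.

```python
from typing import Optional


def restart_sequence_sql(sequence: str, max_id: Optional[int]) -> str:
    """Build a statement that makes the sequence continue after max_id."""
    # An empty table reports no maximum, so start from 1 in that case.
    next_value = (max_id or 0) + 1
    return f"ALTER SEQUENCE {sequence} RESTART WITH {next_value};"
```

For example, restart_sequence_sql("climate.station_id_seq", 1000) builds a statement that restarts the sequence at 1001, so the next inserted row does not collide with an imported primary key.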

Schema Prefix

There is no need to prefix tables with the schema name. Use:

SET search_path TO climate;

#3


2  

If you've converted a schema then migrating data would be the easy part:

  • dump schema from PostgreSQL (you said that you've converted schema to postgres, so we will dump it for now, as we will be deleting and recreating target database, to have it cleaned):

    pg_dump dbname > /tmp/dbname-schema.sql
    
  • split the schema into 2 parts: /tmp/dbname-schema-1.sql containing the create table statements, and /tmp/dbname-schema-2.sql containing the rest. PostgreSQL needs to import the data after the table definitions are imported, but before foreign keys, triggers etc. are imported.

  • recreate the database with only the first part of the schema:

    drop database dbname;
    create database dbname;
    \c dbname
    \i /tmp/dbname-schema-1.sql
    -- now we have tables without data, triggers, foreign keys etc.
    
  • import data:

    (
       echo 'start transaction;';
       mysqldump --skip-quote-names dbname | grep ^INSERT;
       echo 'commit;'
    ) | psql dbname
    -- now we have tables with data, but without triggers, foreign keys etc.
    

    The --skip-quote-names option was added in MySQL 5.1.3, so if you have an older version, install a newer MySQL temporarily in /tmp/mysql (configure --prefix=/tmp/mysql && make install should do) and use /tmp/mysql/bin/mysqldump.

  • import the rest of schema:

    psql dbname
    start transaction;
    \i /tmp/dbname-schema-2.sql
    commit;
    -- we're done
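
If you prefer Python to the grep pipeline in the import-data step, the same filter can be sketched as a small generator. The name inserts_in_transaction is made up; it keeps only lines starting with INSERT and wraps them in a single transaction, which is all the shell version does.

```python
from typing import Iterable, Iterator


def inserts_in_transaction(dump_lines: Iterable[str]) -> Iterator[str]:
    """Yield only the INSERT lines of a dump, wrapped in one transaction."""
    yield "START TRANSACTION;\n"
    for line in dump_lines:
        # Same test as grep ^INSERT: the line must begin with INSERT.
        if line.startswith("INSERT"):
            yield line
    yield "COMMIT;\n"
```

In a script you could feed it sys.stdin and write the result with sys.stdout.writelines(...), then pipe that into psql in place of the subshell. It assumes each INSERT statement sits on one line of the dump.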
    

#4


0  

Check out etlalchemy. It lets you migrate from MySQL to PostgreSQL, or between several other databases, in 4 lines of Python.

To install: pip install etlalchemy

To run:

from etlalchemy import ETLAlchemySource, ETLAlchemyTarget
# Migrate from MySQL to PostgreSQL
src = ETLAlchemySource("mysql://user:passwd@hostname/dbname")
tgt = ETLAlchemyTarget("postgresql://user:passwd@hostname/dbname",
                          drop_database=True)
tgt.addSource(src)
tgt.migrate()
