Init data only using update sql scripts

Hi, I want some data to be imported once my database created, so I put my customized scirpts into *-create-db.sql, and in case of these scripts wouldn’t be executed if my workmates already created the database in their own database instance, I also put these scripts into update sql folder. It works well, no matter the database is new created or updated.

Here comes the question, I think these scripts of ‘create’ or ‘update’ are duplicated, why not just keep the update scripts. So I tried for a totally new database, if I comments these scripts in *-create-db.sql, only keep the update scripts, while it will not work even I could see those update sql registered as below diagram (btw, what does the ‘is_init’ column mean?).
image

Could you please give some advice on this? Did I mis-understand the mechanism of the db scripts?

Thank you!

Hi,

There is some degree of truth in your words. However CUBA’s system of create/update scripts was developed keeping large and long-living enterprise systems in mind. Schema evolution for such systems can be compared with evolution of living organisms on the Earth.

I would present a such metaphor:
Create scripts: “The God has created all Universe, Earth and Human just after 7 days”.
Update scripts: “4 billion years of evolution, including several massive extinctions”.
In the end, the update scripts will contain mention of dinosaurs. But create-db scripts - will not.

If you are developing a multi-installation product, you would want to create a fresh database schema every time for new installations, and avoid running “add-rename-drop-rename-add-drop” sequence for the new database.

Second feature of the create/update scripts separation is that you can intentionally put different content there.
E.g. when you are introducing new attribute or new configuration setting, you can decide to keep old behavior for existing databases (product installations). But at the same time - enable new behavior for new databases. In such case you will have different contents in the create vs update scripts.

3 Likes

Thanks Alex.

I got you that create and update scripts have different usages for production. And this mechanism could deal with many cases on the product operating, it’s greate!

How about the developing? :thinking:

Below is my team’s situation:

  1. we have several developers on the same cuba project.
  2. each one has an isolated postgres database by docker
  3. each one may change the entity structure, i.e. create/remove/update entity

Normally, with the cuba IDEA add-on, once the entity modified, new update-scripts generated, but that depends on the database each one used. Thus conflicts come from different team members.

  1. Do we need to track the update scripts in git?
  2. If we do tracking, for those conflicts, is it good choice to maintain manually?
  3. If we do tracking, you know we frequently change the git branch, it lead to the inconsistency between developing database and the code, thus the update-scripts generated recursively(add->drop->add again->drop again). :crazy_face:

Hi,

Of course, like everything else in the project.

Note that your create+update scripts is a reflection of your data model.
You’ve added new attribute to the entity -> you should commit changes to the entity together with changes to database scripts that support this change.

After other developer pulls changes from the repository, he/she should run “Update database” to bring working database to the state that conforms to received changes in the data model.

If you change VCS branches and these branches have differences in the data model, you have two choices:

  1. Maintain different database name for each branch. E.g. “sales_master”, “sales_1_0”, “sales_feature_promo” etc. When creating new branch, create new database, e.g. by copying the database from the master database.
    In this way you avoid data schema conflicts and continue to be able to use non-empty database dumps filled with sample data.

  2. If your project doesn’t need sample data for development, just re-create the database every time after switching the branch.

I am not sure what you mean by conflicts.
If there are no conflicts in the entity definitions, why there should be conflicts in the update scripts? Are there two developers adding the same column name to the same table simultaneously?

1 Like

Thanks for your suggestion, quite helpful.

Actually if we follow your suggestion in “branches changing situation”, there is no such case.