Component to facilitate import-export data with external file

While I was searching the forum to have some ideas on implementation of import/export using Excel and CSV files, a lot of forum posts found. The implementation is not easy and straight forward today. It would be great if CUBA team considers a component to manage this.

Yes, we are going to implement a more or less generic approach to export/import in an application component. I hope it will be ready in the next couple of months.

2 Likes

Hello Konstantin, for this new version if we will have something generic for importing data? Is that I do not see this thread associated with any task on youtrack.

Hi Konstantin
Is this enhancement coming in v 6.7? I might have missed it!

Sorry guys, we haven’t made any progress regarding this yet. But we understand its importance, so it is on the roadmap for app components.

2 Likes

Well, it was one of the components I planned to implement in the next months.

TBH there were others before this, in order of priority (based on my own needs).

To Konstantin and other CUBA members: please let me know if you start working on this, so to not waste time on the same thing :wink:

Thx!
Paolo

I’d like to add that I already solved this by writing a nodejs CLI application, that’s able to import whatever excel file into whatever resource-driven REST API.
I wrote that for another API technology (Spring Data REST), but it’s easy portable to CUBA rest API v2 as well.
It uses some convention based naming on excel column headers, along with a JSON schema controller for complex column mappings, like the example below:

[
  {
    "path": "^students\\[[0-9]+\\].caregivers\\[[0-9]+\\].user$",
    "type": "entity",
    "collection": "users",
    "keyProperties": [
      "email"
    ],
    "skipOnMissingKey": true,
    "searchMethod": "findByEmail",
    "methodParams": [
      {
        "name": "email",
        "prop": "email"
      },
      {
        "name": null,
        "prop": null
      },
      {
        "name": null,
        "prop": null
      }
    ]
  },
  {
    "path": "^students\\[[0-9]+\\].caregivers\\[[0-9]+\\]$",
    "type": "entity",
    "collection": "caregivers",
    "keyProperties": [
      "user"
    ],
    "skipOnMissingKey": false,
    "searchMethod": "findByUser",
    "methodParams": [
      {
        "name": "user",
        "prop": "user._links.self"
      },
      {
        "name": null,
        "prop": null
      },
      {
        "name": null,
        "prop": null
      }
    ]
  },
  {
    "path": "^students\\[[0-9]+\\].grade.sector$",
    "type": "entity",
    "collection": "sectors",
    "keyProperties": [
      "label"
    ],
    "skipOnMissingKey": false,
    "searchMethod": "findByLabel",
    "methodParams": [
      {
        "name": "label",
        "prop": "label"
      },
      {
        "name": null,
        "prop": null
      },
      {
        "name": null,
        "prop": null
      }
    ]
  },
  {
    "path": "^students\\[[0-9]+\\].grade$",
    "type": "entity",
    "collection": "grades",
    "keyProperties": [
      "label"
    ],
    "skipOnMissingKey": false,
    "searchMethod": "findByLabel",
    "methodParams": [
      {
        "name": "label",
        "prop": "label"
      },
      {
        "name": null,
        "prop": null
      },
      {
        "name": null,
        "prop": null
      }
    ]
  },
  {
    "path": "^students\\[[0-9]+\\]$",
    "type": "entity",
    "collection": "students",
    "keyProperties": [
      "taxId"
    ],
    "skipOnMissingKey": true,
    "searchMethod": "findByTaxId",
    "methodParams": [
      {
        "name": "taxId",
        "prop": "taxId"
      },
      {
        "name": null,
        "prop": null
      },
      {
        "name": null,
        "prop": null
      }
    ]
  }
]

The searchMethod mechanism allows to automatically fetch linked entities using a custom lookup method, so that if your users gives you excel files that reference other entities via tax id for example, you’ll be able to use that to link the final entities during the import.
It also supports array columns, if some Postgres user happens to use them…
And the corresponding excel sheet has the following column headers:

familyName	givenName	taxId	grade.label	grade.sector.label	caregivers[0].user.email	caregivers[1].user.email	caregivers[0].user.taxId	caregivers[1].user.taxId	caregivers[0].user.familyName	caregivers[0].user.givenName	caregivers[1].user.familyName	caregivers[1].user.givenName	caregivers[0].user.roles[]	caregivers[1].user.roles[] 

In addition, it keeps track of the processed entities, so that if an error occur, you’re able to correct it and recover the import from the point it stopped.
If something like that is useful to someone, let me know and I’ll write some docs for it, and publish it on NPM (currently is in my internal repos only).
(PS: that’s the same mechanism I plan to use on my server side app component, just a bit more smart and automatic, 'cause we already have metadata info associated to the entities)
Bye
Paolo

1 Like

Any update on the release of this component?

1 Like

Hi Konstantin
We hope the CUBA team has made a good progress on this. Could you please shed some light?

Mortoza

1 Like

Any plans for implementing this feature in the near term? Thanks.

Nearly 5 months on - no action, no info - disappointing.

Aleksey @stukalov , please shed some light on the topic.

Hi @mikeheffernan1301 ,

I have a sample project on my github here. It organizes the infrastructure and base classes for import from xls files. So, feel free to copy-paste from or just re-use the idea in your projects.

Currently, development of the component moved to this repository. You can see some issues that are currently under discussion. Me (@stukalov ) and @mario are developing this component on our own, following principles of Open Source. We would appreciate if you could find some time to contribute as well.

Regards,
Aleksey

2 Likes

Hi Aleksey,

Pleased to see this is underway. Unfortunately I am not a Java expert so I would be hesitant to contribute. My interest is really that of a business systems analyst and I see Cuba as a fantastic tool to demonstrate business system concepts. I was able to upgrade your sample project to the 6.8.1 platform and incorporate the latest poi 3.17 jars. My main problem has been deprecation, particularly the XlsHelper.java switch for determining the cell value. After researching this I am now using the following for my particular project which seems to work OK.

switch (cell.getCellTypeEnum()) {

        case BLANK:
            return null;

        case STRING:
            String formattedCellValue = cell.getStringCellValue().replace(String.valueOf(NON_BREAKING_SPACE), " ").trim();
            return formattedCellValue.isEmpty() ? null : formattedCellValue;

        case NUMERIC:
            if (isDateCell(cell)) {
                return cell.getDateCellValue();
            }
            Double numericCellValue = cell.getNumericCellValue();
            if (!isAlmostInt(numericCellValue))
                return numericCellValue;
            else {
                if (numericCellValue > Integer.MAX_VALUE) {
                    Long value = numericCellValue.longValue();
                    return value;
                } else {
                    Integer value = numericCellValue.intValue();
                    return value;
                }
            }
        case FORMULA:
            return getFormulaCellValue(cell);
        default:
            throw new IllegalStateException(String.format("Cell type '%s' is not supported", cell.getCellTypeEnum()));
    }

Happy to help in any way I can but I’m not a Mario. Also congratulations to you and your colleagues for a great product.

Go Cuba!

Mike

1 Like

Hi @mikeheffernan1301,

Thank you very much!

hi,

as @stukalov already mentioned we are working on the data-import app component which should allow you easily to import data from various sources into your application.

I just released a preview version 0.1.0, which you can checkout. More details can be found in the corresponding announcement:

Please give me feedback and feature wishes. There is already something in the making for version 0.2.0 which you can check out as well.

Bye
Mario

Hi Mario
Thank you so much for volunteering this app component! It looks very good, I shall give a try shortly. From the eagerness of the community it is obvious how much it has been expected and you did a great job for the community.

Few more features request:

1- option to create excel / CSV template for reimport. Create template options with and without data.
2- how the related entities will be handled in import, if it is already working, excellent! Otherwise this is a must feature to use it practically!
3- log for unsuccessful import and option to recall in such case

hi @mortozakhan,

thanks for the kind words.

Regarding your requests:

1 - i’m not really sure what you mean by that. Can you elaborate a little bit more?

2 - it is already implemented, though not documented and not very obvious ATM. Currently you can to N:1 references and even multiple levels. The only requirement is that along the way there has to be only a single result (otherwise a uniquily identification is not possible).

A concrete example can be found in this example:

There is MlbPlayer and a MlbTeam which the MlbPlayer references. In the example import file: https://github.com/mariodavid/cuba-component-data-import-tests/blob/master/example-data/mlb_players.csv
you see that it has a column “team”, but the values reference the attribute “code” of the MlbTeam. This is possible to resolve. The attribute mapper for this column has to have the value “ddcdit$MlbTeam.code” for “entity attribute”. Then the reference will be picked up accordingly…

3 - this is somewhat implemented but very rudimentary. There is an open issue regarding single transaction / transaction per entity (different import strategies: single-transaction, transaction-per-entity · Issue #18 · mariodavid/cuba-component-data-import · GitHub). But it can definitively improved, like showing which errors occured etc. If you have more detailed use cases / requirements, we can talk about it…

Bye
Mario

Hi @mario
Thanks. Here is additional details-
1- it’s about generating the excel file mapping the Entity in order to populate data for importation. Therefore, this is step before importing the excel file.

3- this is about option to commit or reject when there is any error in import. User will have option reviewing the errors before committing the excel file into database.

hi, can you tell me the reason behind it? This means, that there is not yet a file to be imported. Therefore you want to create a template that someone can fill in (with the correct column names and stuff) and then you will import it afterwards. Is that the reason? Why would you not let the user enter the data directly into the app in the first place?

3- this is about option to commit or reject when there is any error in import. User will have option reviewing the errors before committing the excel file into database.

In 0.2.0 there is a feature which deals to some degree with that: GitHub - mariodavid/cuba-component-data-import: CUBA component for easy data import

Perhaps this is also something related:

Bye
Mario