I noticed something weird. If I run the script several times with the same command-line arguments, it may output the items in a different order on each run. Why?
Could you please provide test data so I can reproduce this?
My gut says this is related to Perl's internal array handling.
But that should not matter - LDIF content files are not ordered.
I've attached a CSV file with just three records.
Thanks!
Andrea
Hi, I investigated:
the reason is that the tool stores its data in a Perl hash, and hashes are, by definition, unordered.
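To illustrate the effect, here is a minimal Perl sketch. The hash contents are made up for illustration and are not the tool's real data structure. Since Perl 5.18 the traversal order of a hash is randomized per process, so `keys` may return the entries in a different order on each run, while `sort keys` is stable:

```perl
use strict;
use warnings;

# Hypothetical entries keyed by DN; not the tool's actual data.
my %entries = (
    'cn=alice,dc=example,dc=org' => 'Alice',
    'cn=bob,dc=example,dc=org'   => 'Bob',
    'cn=carol,dc=example,dc=org' => 'Carol',
);

# The order returned by keys() is not guaranteed and may differ between runs.
my @unordered = keys %entries;

# Sorting the keys gives the same order on every run.
my @ordered = sort keys %entries;

print "unordered: @unordered\n";
print "sorted:    @ordered\n";
```

For debugging, the `PERL_HASH_SEED` environment variable described in perlrun can be used to make the traversal order repeatable between runs.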
So, if this is really important, one option would be to add the data to the hash under systematic keys (i.e. prefix each key with an entry counter) and sort the entries by key. However, that would break the duplicate handling that merges several lines for the same entry, so we would need a solution for that too.
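A rough sketch of the counter-prefix idea and its drawback, assuming the data is a hash from DN to a list of attribute lines (the input rows, the `%06d|` key format and all variable names are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical input: (DN, attribute line) pairs in CSV order.
my @rows = (
    [ 'cn=carol,dc=example,dc=org', 'mail: carol@example.org' ],
    [ 'cn=alice,dc=example,dc=org', 'mail: alice@example.org' ],
    [ 'cn=carol,dc=example,dc=org', 'mail: c.smith@example.org' ],
);

my %data;
my $counter = 0;
for my $row (@rows) {
    my ($dn, $line) = @$row;
    # Zero-padded counter prefix, so a plain string sort restores input order.
    my $key = sprintf('%06d|%s', $counter++, $dn);
    push @{ $data{$key} }, $line;
}

# The sorted keys now follow the CSV order ...
my @keys = sort keys %data;
print "$_\n" for @keys;

# ... but the two carol rows landed under different keys,
# so they are no longer merged into one entry.
```

The output order becomes stable, but every row gets its own key, which is exactly the duplicate-handling problem mentioned above.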
But again, LDIF itself is not ordered per se. So is this important to you, and do you need it fixed? What is your use case?
Ref: https://stackoverflow.com/questions/10901084/how-can-i-sort-a-perl-hash-on-values-and-order-the-keys-correspondingly-in-two#10901159
One implementation idea would be to maintain a final data structure that holds an ordered list whose elements point to the data in the hash. The list can be kept in input order or sorted, so each entry could be printed at the position of its first occurrence in the CSV while the duplicate check stays unchanged.
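The idea could look roughly like the following sketch. It is only an illustration under assumed names (`@rows`, `%data`, `@order`) and made-up input, not the code that ended up in the tool:

```perl
use strict;
use warnings;

# Hypothetical input: (DN, attribute line) pairs in CSV order.
my @rows = (
    [ 'cn=carol,dc=example,dc=org', 'mail: carol@example.org' ],
    [ 'cn=alice,dc=example,dc=org', 'mail: alice@example.org' ],
    [ 'cn=carol,dc=example,dc=org', 'mail: c.smith@example.org' ],
);

my %data;   # DN => array ref of attribute lines (duplicates merge as before)
my @order;  # DNs in order of first appearance in the input

for my $row (@rows) {
    my ($dn, $line) = @$row;
    # Remember the DN only the first time it is seen.
    push @order, $dn unless exists $data{$dn};
    push @{ $data{$dn} }, $line;
}

# Walk the ordered list instead of the hash when printing.
my @ldif;
for my $dn (@order) {
    push @ldif, "dn: $dn", @{ $data{$dn} }, '';
}
print join("\n", @ldif), "\n";
```

The hash still does the lookup and duplicate merging, while the separate list alone decides the output order; replacing `@order` with `sort @order` would give alphabetical output instead.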
Last edit: Benedikt Hallinger 2023-12-31
Benedikt,
in my view, a specific order for the components in the output shouldn't be required at all. Still, I wonder why there is no fixed, predictable "default order": as it is, the order of the components in the output may change randomly from one run to the next. I trust that no component is missing, but this random behavior gives the impression of something out of control, and mysterious at the same time. At least to me, since I probably don't yet know everything I should about Perl hash tables.
I'd at least like to understand where the random behavior comes from.
Thanks!
Andrea
Fixed in 1.2:
https://sourceforge.net/projects/csv2ldif2/files/csv2ldif2-1.2.tar.gz/download
Thanks, Benedikt!
Andrea