CSV Import of Venues then Organizers, and trying to fit our setup

Viewing 8 posts - 1 through 8 (of 8 total)
  • #1168885
    Damion
    Participant

    Basically, we had a very old somewhat convoluted database setup. The old website shows event listings for mud runs / obstacle course races.

    In the old setup, the races themselves were in one table, address information was tied to the races themselves, and the main companies that ran each series of events had to be extracted from a column, reformatted, and checked for duplicates.

    So, making an Organizer table was simple enough. We took just the name of the overall company/series responsible, and borrowed some meta information from the race to fill in the other fields in the Organizer CSV. Everything’s great there, because there’s no dependency on time or location.

    Making a Venue list was harder, because previously there weren’t names of the places on file. It was,

    Event Name
    123 Fake St.
    Notrealstown, OK

    and it was -not-

    Event Name
    Pretend Civic Center and Fairgrounds
    123 Fake St.
    Notrealstown, OK

    (as an example =)

    But we needed the address data to fit into the hierarchy here of this plug-in, especially to get automated stuff done here, like showing maps, finding -by- map, geolocation, etc.

    Based on what we’d seen of the individual Edit Event screens, it looked like you had to assign each event to a venue, by name. So it didn’t make sense to try importing without names at all. How would the event reference the venue?

    So, to get it to work at all, I thought, maybe I’ll just name them something similar, like I did for Organizers. With a conventional WordPress post, I can call two pages About, and it differentiates between them by changing the URL (like /about-2/) and you rarely run into issues with referring to it by name and them having the same name….maybe a drop-down here or there, or adding them to the menu if you haven’t made a parent/child relationship.

    So I go to import Venues, and there are ~6200 events, and each had address info attached to the event in the old site (even though there’s likely some overlap if one place holds two races over time)…so there were ~6200 venues. Not amazingly efficient, but at least a 1:1 relationship ensured accuracy.

    When the Import is complete, over 1200 Venues imported (“inserted”) and the rest are updates, presumably because, as that screen says, it detected duplicates, like having similar titles.

    Unsurprisingly, the event imports went a similar way, with ~4500 real entries, and the remainder being updates that got overwritten.

    As a result, every piece of information is filled out on the site, but there’s a mismatch: two events occurring on the same day in different places are now smooshed down to one event, or an event that should be in Canada on one date has the location of an earlier edition of that event which took place in Florida. (After the import process was through, Florida was the last surviving update/overwrite of that identical “venue name”, which was, as a reminder, just the race name used for the Venue Name field so that all fields, including the addresses, were present for the original import.)

    So my question is two-fold:

    1.) Is there some workable mechanism to prevent a duplicate from updating an existing field unless -every single field- is identical? (We’ve tried helping things along by adding new fields, like “Event ID” and “Promo Code” from the old site)…

    ..and..

    2.) We can do it the right way from here on out, actually specifying the venue name as we learn it and apply it, for all future races…but do you have any suggestions as far as how to make the import data nice beforehand to make sure we avoid the issue?

    I’m not in love with parsing the database and calling every venue, for example, “John Irving Park,” “John Irving Park-2,” “John Irving Park-3,” etc. just for the sake of unique values.

    I know Recurring Events has a system in place to deal with the same thing happening in multiple times/places, but that seems like it’s for regularly scheduled things, not events that might take place sporadically three to five times scattered across the year.

    I just need some guidance on how to set things up to “talk to” the plug-in correctly in a way that it can handle our needs.

    Thanks for reading thus far, and I’m happy to attach any CSVs once I can do so privately.

    Cheers,
    Vincent J.

    #1169249
    Nico
    Member

    Howdy Vincent,

    Welcome to our support forums and thanks for reaching out to us! Also thanks for the detailed description of the migration process.

    Before jumping to your actual questions, let me explain the ‘find duplicates’ criteria the plugin is using:
    – Events: Same title, start date, end date and ‘all day’ setting.
    – Venues & Organizers: just looks at the title/name field.
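To see ahead of time how many rows those criteria would fold together, the CSV can be checked before importing. This is only a minimal sketch that models the rule as described above; it does not call any plugin code. The column names (`name`, `start`, `end`, `all_day`, `city`) and the sample rows are hypothetical, and exact string comparison is an assumption.

```python
import csv
import io
from collections import Counter

def count_collisions(rows, key_fields):
    """Return {key: n} for every key that is shared by more than one row."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}

# Hypothetical sample in the shape of an old export.
EVENTS_CSV = """name,start,end,all_day,city
Crazy Fun Raceathon,2017-04-22,2017-04-22,yes,Tulsa
Crazy Fun Raceathon,2017-04-29,2017-04-29,yes,Culver City
Crazy Fun Raceathon,2017-04-29,2017-04-29,yes,Orlando
"""

rows = list(csv.DictReader(io.StringIO(EVENTS_CSV)))

# Events are compared on title + start + end + all-day flag.
event_collisions = count_collisions(rows, ["name", "start", "end", "all_day"])
# Venues that were named after the event are compared on title only.
venue_collisions = count_collisions(rows, ["name"])

print("events that would be updated instead of inserted:", event_collisions)
print("venues that would be updated instead of inserted:", venue_collisions)
```

Any key reported here stands for rows that would presumably show up as updates rather than inserts on the import result screen.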

    1.) Is there some workable mechanism to prevent a duplicate from updating an existing field unless -every single field- is identical? (We’ve tried helping things along by adding new fields, like “Event ID” and “Promo Code” from the old site)…

    Are the event titles the same in some cases? If so, are the start date/time and end date/time also the same, even though they are still different events?

    2.) We can do it the right way from here on out, actually specifying the venue name as we learn it and apply it, for all future races…but do you have any suggestions as far as how to make the import data nice beforehand to make sure we avoid the issue?

    For this case I understand you used the event name to name the venues? So the question would be the same as above, if titles are the same for various Events I guess this is the reason for the duplicates.

    I think the best option might be to actually disable this ‘check for duplicates’ for the initial data migration. Taking a look at the code, this doesn’t seem possible without editing the core files, but as it is a temporary edit, it might be a good idea. What do you think?

    Best,
    Nico

    #1169293
    Damion
    Participant

    Thank you for the clarification on how the duplicate-sniffing mechanism works. The screen is helpful, but it helps sometimes even just to hear it be presented a different way.

    Regarding the events, any given kind of race has the exact same title, but all have unique start dates, and being a few hours long, there shouldn’t be any that carry on past that day.

    So you might have (this is made up)

    “Crazy Fun Raceathon” Tulsa, OK 4/22/2017
    “Crazy Fun Raceathon” Culver City, CA 4/29/2017

    and since the actual events don’t list addresses, you get just the title and dates, and I’ll have

    “Crazy Fun Raceathon” 4/29/2017

    or something similar. It treated one example of the series as the only one.

    Then, when it went to go find a venue by that same name, and it searched through the venues (that were already condensed by their own anti-duplicate process), it only found that the venue “Crazy Fun Raceathon” is a venue in Tulsa, OK. The other locations were eliminated for being otherwise identical.

    so my finished listing looks like

    “Crazy Fun Raceathon” Tulsa, OK 4/29/2017

    Which is something that doesn’t exist. -A- correct location and -a- correct date, but switched out to not match the date or location to which they belong.

    I don’t mind editing the core files, and as a developer I’m very comfortable working with code. Once we’re “patched” we can happily do as I said and do it the “right way” within this new process (which is good and clean and smart and I like it, but it’s different from where we left)….

    Ideally, if there could be a way that it would prevent a duplicate if there’s a totally separate address, that would work, but I recognize that, as mentioned before, the system would have a hard time figuring out how to refer to those venues to connect them to the appropriate listing, since without interference they’d find a bunch of matches for the exact same name.

    It’s clunky, but I thought about concatenating the venue names. In other words, “Crazy Fun Raceathon” is the event, but there’s two venues, “Crazy Fun Raceathon – Tulsa” and “Crazy Fun Raceathon – Culver City”…then I could programmatically add a body class to events that were before this migration date, and use CSS to hide them so it didn’t look silly and people only saw the address proper.
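The concatenation idea above could be prepared in the CSV before import. A minimal sketch, assuming hypothetical column names (`name`, `date`, `address`, `city`, `state`) and made-up sample rows: it builds one venue per "event name - city" title, reuses that venue when the address repeats, and stops if the same title would point at two different addresses, since those rows would still overwrite each other under a title-only match.

```python
import csv
import io

# Hypothetical sample in the shape of an old export.
EVENTS_CSV = """name,date,address,city,state
Crazy Fun Raceathon,2017-04-22,123 Fake St.,Tulsa,OK
Crazy Fun Raceathon,2017-04-29,456 Sample Ave.,Culver City,CA
Crazy Fun Raceathon,2018-04-21,123 Fake St.,Tulsa,OK
"""

def venue_name(event_name, city):
    """Build a venue title from the event series name and the city."""
    return f"{event_name} - {city}"

def build_venues(events):
    """Return (venues, linked_events) with one venue per distinct title."""
    venues = {}
    linked = []
    for ev in events:
        name = venue_name(ev["name"], ev["city"])
        address = (ev["address"], ev["city"], ev["state"])
        # Same title with a different address would still collide on import.
        if name in venues and venues[name] != address:
            raise ValueError(f"venue name {name!r} maps to two addresses")
        venues[name] = address
        # Each event row carries the venue title it should be matched to.
        linked.append({**ev, "venue": name})
    return venues, linked

events = list(csv.DictReader(io.StringIO(EVENTS_CSV)))
venues, linked = build_venues(events)

for name, address in venues.items():
    print(name, "->", ", ".join(address))
```

With the sample rows, the two Tulsa races share a single venue, which is the kind of legitimate overlap mentioned earlier in the thread.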

    What was the idea for editing the core for the events duplicates, if you don’t mind me asking?

    Thanks again for your help so far, and the quick response time.

    #1170302
    Nico
    Member

    Thanks for the follow-up Vincent! Glad to hear you are comfortable working with code 🙂

    Now I see two possibilities:

    • Craft unique names for events and venues: as you mentioned, concatenate information to create unique names. Event titles would be ‘Event name + event date’, and venue titles ‘Event name + event location’. This way you’ll have all unique entries.
    • Override core files: this is generally not suggested, but as this would be a temporary patch to batch import the initial data of the site, I see no problem in doing this. The files controlling the imports are ‘wp-content/plugins/the-events-calendar/src/Tribe/Importer/File_Importer_Events.php’ and, in the same path, ‘File_Importer_Venues.php’. The method match_existing_post of each class (file) is the check for duplicates. You can modify the code (in both files) to:

      // Temporary patch: report "no match" so every CSV row is inserted as a new post.
      protected function match_existing_post( array $record ) {
          return 0;
      }

      and try to import the data again. Hopefully it will create the exact number of events and venues you have.

    You can also combine both options if you need to! Also, I think it’s a great idea to tag or categorize all imported events just in case you’ll need to do some fixes on the front-end later!
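Tagging the imported rows could also be done in the CSV itself before import. A minimal sketch, assuming the file has (or gets) a column that you map to the event category during import; the column name `category`, the marker value `Legacy Import`, and the comma separator for multiple categories are all assumptions to check against your own setup.

```python
import csv
import io

MIGRATION_CATEGORY = "Legacy Import"  # hypothetical marker category

# Hypothetical sample rows; the second already has a category.
EVENTS_CSV = """name,date,category
Crazy Fun Raceathon,2017-04-22,
Crazy Fun Raceathon,2017-04-29,Mud Run
"""

def tag_rows(rows, column="category", value=MIGRATION_CATEGORY):
    """Return copies of the rows with the marker category added."""
    tagged = []
    for row in rows:
        existing = (row.get(column) or "").strip()
        # Assumption: multiple categories are separated by commas.
        combined = f"{existing}, {value}" if existing else value
        tagged.append({**row, column: combined})
    return tagged

rows = list(csv.DictReader(io.StringIO(EVENTS_CSV)))
tagged = tag_rows(rows)

# Write the tagged rows back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(tagged[0].keys()))
writer.writeheader()
writer.writerows(tagged)
print(out.getvalue())
```

Having one marker on every migrated event makes it easier to target them later, for example for front-end fixes or a clean bulk delete between trial imports.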

    Please let me know if this works for you,
    Best,
    Nico

    #1170839
    Damion
    Participant

    Thanks for the update. I just now saw this; didn’t realize notification emails weren’t coming in.

    For presentation’s sake, and a few technical and user experience issues, it would be a major issue to have to add to the event names themselves.

    We have a lovely bulk delete plugin that lets us do trial and error pretty cleanly with getting these records in, so I think I’ll mix and match. Do the concatenated venues and edit the plugin for event import and see how close we get.

    I’ll have to double check the original database records tomorrow at work, then try it, then I’ll come back and let you know how it went. If nothing else it moves the needle way closer to the desired end result.

    Thanks so much for checking in again so soon and providing traction on this issue. So far both the free and Pro versions seem to be a crazy good value for how well they deliver, for those people who need what they offer.

    #1170967
    Nico
    Member

    You are welcome Damion!

    Yeah, I imagined that combining both solutions might be the way to go for you. Hope the import goes well and remember to revert the changes made to the core files once you are done.

    Regarding the bulk delete of previous imports, make sure to check in the database that the post meta is correctly removed as well! If not, it might be a good idea to do the import into a clean new database once you have fine-tuned the import method.

    Finally thanks for the kind words about our plugins, I’ll share them with the rest of the team 😉

    Have a great weekend,
    Nico

    #1171244
    Damion
    Participant

    Magic!

    Venues still had some overwrites, but those remaining appear to be warranted as previously unseen duplicates (in every way, not just the title it was looking for).

    Events were overwritten again. Even though important information (like location, as well as custom meta we added) differed between events with otherwise identical titles, dates and times, it kept some and not others. That is what it was designed to do, but our special case needed leniency.

    So I changed the core code, as you said, and far from being fluffy or going madcap, it just generated new events with separate information without trying to qualify it with matches.

    Every last one was an insert, and not an update, and the test samples for nearly-identical events (across all meta, keys and values) came through the other side perfectly!

    Thanks again for your guidance. It really is a blessing to have this tackled, so that everything else can use the good model of data to its fullest abilities.

    Have a great weekend!

    #1171903
    Nico
    Member

    Wooot! Stoked to hear that, Damion!

    Great work, hopefully this effort will make upcoming work easier 🙂

    I’ll go ahead and close out this thread, but if you need help with anything else please don’t hesitate to create a new one and we will be happy to assist you.

    Hope you have a great week,
    Nico

  • The topic ‘CSV Import of Venues then Organizers, and trying to fit our setup’ is closed to new replies.