In February 2020, we hosted the NYC dbt MeetUp at the Bowery Farming office. [Travis Dunlop](https://www.linkedin.com/in/travis-dunlop/) and I shared what we learned migrating our 387-model dbt project from Redshift to Snowflake.

Our key points were:
- Redshift was great for our first two years of growth, but we were spending more and more time on maintenance and troubleshooting to keep our hourly run times under an hour.
- Snowflake looked cost-competitive in the trials we ran.
- Snowflake's documentation for this kind of migration was surprisingly thin; when we inquired, they pointed us to a dbt blog post.
- We developed a workflow to compare the materialized tables from our Redshift and Snowflake branches and automate testing of the diff (a sketch of this kind of comparison appears after the list below). We worked through the list of tables as a team, finding and resolving differences. This process took 40 person-days.
- The biggest syntax differences for our codebase (small examples follow the list below) were:
    - Snowflake has `lateral flatten` to extract JSON into rows. Nice.
    - Snowflake distinguishes SQL nulls (checked with `is null`) from JSON nulls inside a `variant` (checked with `is_null_value()`)
    - `least` and `greatest` return null in Snowflake when any argument is null, whereas Redshift ignores null arguments
    - we had relied on `using` for succinct join expressions, but it behaves differently in Snowflake, so we moved to the stricter `on` syntax
    - Snowflake has no `infinity` timestamp, which we had used for our slowly changing dimensions
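
To make those syntax points concrete, here are small sketches of the Snowflake-side patterns involved. The table and column names are illustrative, not from our project.

```sql
-- Illustrative sketches only: table and column names are made up.

-- lateral flatten: explode a JSON array held in a variant column into rows
select o.id,
       item.value:sku::string as sku
from orders o,
     lateral flatten(input => o.line_items) item;

-- SQL null vs JSON null: a variant field can hold a JSON null, which
-- `is null` does not catch; `is_null_value()` does
select count(*)
from orders
where payload:coupon is not null            -- filters SQL nulls
  and not is_null_value(payload:coupon);    -- filters JSON nulls

-- least/greatest: Snowflake returns null when any argument is null
-- (Redshift ignores nulls), so nullable arguments get a coalesce fallback
select greatest(coalesce(updated_at, created_at), created_at) as last_touched_at
from orders;

-- using -> on: joins written as ...
select o.id
from orders o
join customers using (customer_id);
-- ... were rewritten with explicit join conditions
select o.id
from orders o
join customers c on o.customer_id = c.customer_id;

-- infinity timestamps: Snowflake has no 'infinity'::timestamp, so "still
-- current" rows in slowly changing dimensions get a far-future sentinel
select customer_id,
       plan,
       valid_from,
       coalesce(valid_to, '9999-12-31'::timestamp) as valid_to
from customer_plan_history;
```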
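
As for the comparison workflow, the core of each table diff can be expressed as a symmetric `except`. This is a minimal sketch under the assumption that both builds of a model are reachable from a single connection, with hypothetical schema and table names:

```sql
-- Hypothetical sketch of a single table diff; a passing comparison
-- returns zero rows. Names below are illustrative, not from our project.
(
    select * from comparison.orders_redshift_build
    except
    select * from comparison.orders_snowflake_build
)
union all
(
    select * from comparison.orders_snowflake_build
    except
    select * from comparison.orders_redshift_build
);

-- except has set semantics, so a row-count check alongside the diff
-- catches duplicate-row differences the query above would miss
select
    (select count(*) from comparison.orders_redshift_build) as redshift_rows,
    (select count(*) from comparison.orders_snowflake_build) as snowflake_rows;
```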
<iframe width="560" height="315" src="https://www.youtube.com/embed/VhH614WVufM?si=jlbO1uMWz1im9bEc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>