If you are already in the fortunate position of having built a lakehouse with Azure Databricks or Synapse Spark, you have the perfect starting point. With the concept of shortcuts (for those of you familiar with SQL Server: comparable to a linked server), the existing Delta Lake can be used directly and without detours: the data becomes available to the many compute engines in Fabric (data science, IoT, Power BI, etc.) without having to be copied. Important: the data must already be stored as a Delta Lake; otherwise you have to convert the files first.
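To make the shortcut idea concrete, here is a small sketch of the request body that the Fabric REST API expects when creating an ADLS Gen2 shortcut inside a lakehouse (endpoint: `POST /v1/workspaces/{workspaceId}/items/{itemId}/shortcuts`). The helper name, the connection ID, and the storage paths are placeholders, not values from this article:

```python
import json

# Illustrative helper: builds the JSON body for the Fabric shortcuts API.
# All IDs and URLs below are placeholders.
def build_shortcut_request(name: str, path: str,
                           connection_id: str, location: str, subpath: str) -> dict:
    """Return the request body for creating an ADLS Gen2 shortcut."""
    return {
        "name": name,    # shortcut name as it appears in the lakehouse
        "path": path,    # target folder inside the lakehouse, e.g. "Tables"
        "target": {
            "adlsGen2": {  # points at the storage behind the existing Delta Lake
                "connectionId": connection_id,
                "location": location,  # storage account endpoint
                "subpath": subpath,    # folder holding the Delta table
            }
        },
    }

body = build_shortcut_request(
    "sales", "Tables",
    "00000000-0000-0000-0000-000000000000",
    "https://mystorage.dfs.core.windows.net",
    "/delta/sales",
)
print(json.dumps(body, indent=2))
```

Once such a shortcut exists under `Tables`, the Delta table shows up in the lakehouse like any native table, without any data movement.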
This provides an immediate benefit: Power BI can work with this data in Direct Lake mode. That means Power BI reads the Delta files directly, without first importing the data into a semantic model (formerly: dataset). In short: you get the performance of import mode combined with the freshness of DirectQuery, and you save the scheduled refresh time of the semantic model, a.k.a. dataset.
Building on this shortcut concept, it now becomes possible to migrate the dedicated SQL pools and use them as either a lakehouse or warehouse in Fabric. Likely steps:
- Export the dedicated SQL pools into a SQL project and import this project into Fabric.
- Meanwhile, PowerShell scripts on GitHub support converting dedicated SQL pool DDL to Fabric DDL.
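The core of such a DDL conversion can be sketched in a few lines. This is a toy illustration, not the scripts mentioned above: it strips the dedicated SQL pool table options that Fabric Warehouse does not accept and maps two data types (`nvarchar` and `datetime` are not supported in Fabric Warehouse); a real migration needs a proper T-SQL parser:

```python
import re

# Toy DDL converter: dedicated SQL pool -> Fabric Warehouse (illustrative only).
TYPE_MAP = {
    r"\bnvarchar\b": "varchar",       # Fabric Warehouse stores strings as varchar
    r"\bdatetime\b": "datetime2(6)",  # datetime is not supported; datetime2 is
}

def convert_ddl(ddl: str) -> str:
    # Drop the WITH (DISTRIBUTION = ..., ... INDEX) table options block.
    ddl = re.sub(r"\s*WITH\s*\(.*?\)\s*;?\s*$", ";", ddl, flags=re.S | re.I)
    for pattern, replacement in TYPE_MAP.items():
        ddl = re.sub(pattern, replacement, ddl, flags=re.I)
    return ddl

source = """CREATE TABLE dbo.Sales (
    Id int NOT NULL,
    Customer nvarchar(100),
    SoldAt datetime
)
WITH (DISTRIBUTION = HASH(Id), CLUSTERED COLUMNSTORE INDEX);"""

print(convert_ddl(source))
```

Distribution options disappear entirely because Fabric manages data layout itself; there is no hash or round-robin distribution to declare.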
A migration assistant announced for 2024 will even be capable of automatically redirecting endpoints.
We will come back later to a tool that may support this.
What about Azure Data Factory? With Microsoft Fabric now generally available, Fabric pipelines are available too, and most orchestration activities are fully supported. Since ADF can also write directly to OneLake, the migration effort is markedly reduced.
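Writing to OneLake from ADF goes through a linked service pointing at the Fabric lakehouse. The sketch below shows the rough shape of such a definition; the property names follow the Microsoft Fabric Lakehouse connector as we understand it, and all GUIDs and the Key Vault reference are placeholders you must replace (verify the exact schema against the official connector documentation):

```python
import json

# Sketch of an ADF linked service for a Fabric lakehouse (placeholder values).
linked_service = {
    "name": "FabricLakehouseLS",
    "properties": {
        "type": "Lakehouse",
        "typeProperties": {
            "workspaceId": "<fabric workspace GUID>",
            "artifactId": "<lakehouse GUID>",
            "tenant": "<tenant GUID>",
            "servicePrincipalId": "<app registration GUID>",
            "servicePrincipalKey": {
                "type": "AzureKeyVaultSecret",
                "store": {"referenceName": "MyKeyVaultLS",
                          "type": "LinkedServiceReference"},
                "secretName": "fabric-sp-secret",
            },
        },
    },
}
print(json.dumps(linked_service, indent=2))
```

A Copy activity can then use this linked service as its sink, landing files or tables directly in OneLake instead of an intermediate storage account.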
However, at present there are no mapping data flows in Microsoft Fabric. This means that existing mapping data flows must either be rebuilt as Dataflows Gen2 in Fabric (guide: https://aka.ms/datafactoryfabric/docs/guideformappingdataflowusers), or the mapping data flow logic must be converted to Spark code. Here too the Fabric Customer Advisory Team (Fabric CAT) has provided tooling: https://github.com/sethiaarun/mapping-data-flow-to-spark.
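The idea behind such a conversion can be illustrated with a toy code generator (this is not the Fabric CAT tool): each transformation in the flow graph is emitted as one chained PySpark DataFrame call. The transformation names and the emitted API calls are simplified assumptions for the sketch:

```python
# Toy mapping-data-flow-to-Spark emitter: each flow step becomes one
# chained PySpark call. Illustrative only; real flows have branching graphs.
EMITTERS = {
    "source": lambda t: f'spark.read.format("delta").load("{t["path"]}")',
    "filter": lambda t: f'.filter("{t["condition"]}")',
    "select": lambda t: f'.select({", ".join(repr(c) for c in t["columns"])})',
    "sink":   lambda t: f'.write.format("delta").mode("overwrite").save("{t["path"]}")',
}

def flow_to_spark(flow: list) -> str:
    """Emit a single chained PySpark expression for a linear flow."""
    return "".join(EMITTERS[step["type"]](step) for step in flow)

flow = [
    {"type": "source", "path": "Tables/sales"},
    {"type": "filter", "condition": "amount > 0"},
    {"type": "select", "columns": ["customer", "amount"]},
    {"type": "sink", "path": "Tables/sales_clean"},
]
print(flow_to_spark(flow))
```

The generated Spark code can then run in a Fabric notebook against the lakehouse, which is exactly the migration path the Fabric CAT tooling automates.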
Since VNet data gateways are now also supported for Dataflows Gen2, integration into existing network infrastructures works as well!
It has also been announced that in the second half of 2024 it will be possible to mount an existing Azure Data Factory in Fabric. We already know this principle from earlier SSIS workloads.