Trifacta, the Data Engineering Cloud company, today announced that Google Cloud Dataprep by Trifacta now leverages the full power of SQL to transform data inside BigQuery. These new capabilities accelerate data engineering tasks by up to 20x by eliminating the need to move data outside of the warehouse.
With BigQuery pushdown, Google Cloud Dataprep supports Extract, Load, Transform (ELT) workflows that operate directly on data in the warehouse. Using SQL, data is transformed in place, which makes manipulations such as filters, joins, unions, and aggregations efficient. Dataprep automatically determines when a data pipeline can be partially or fully translated into BigQuery SQL statements, enabling enterprises to run data analytics workloads at any scale with Google Cloud Dataprep by Trifacta. BigQuery execution provides high performance and flexibility while optimizing cost. Dataprep integrates natively with BigQuery security at both the user level and the service level by tying into standards such as OAuth and IAM.
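As a rough illustration of what partial or full translation into SQL can mean, the following Python sketch splits a list of transformation steps into a leading portion that can be expressed as SQL and a remainder that would run elsewhere. It is not Dataprep's actual implementation: the `Step` class, the `PUSHABLE` set, the step kinds, and the table name are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

# Hypothetical set of step kinds assumed to be expressible in SQL.
PUSHABLE = {"filter", "aggregate"}


@dataclass
class Step:
    kind: str           # e.g. "filter", "aggregate", or something not pushable
    arg: str            # SQL predicate or aggregate expression
    group_by: str = ""  # grouping column, used only by "aggregate"


def split_pipeline(steps):
    """Return (pushable prefix, remaining steps) for a list of steps."""
    for i, step in enumerate(steps):
        if step.kind not in PUSHABLE:
            return steps[:i], steps[i:]
    return list(steps), []


def to_sql(table, steps):
    """Compose pushable steps into one statement using nested subqueries."""
    sql = f"SELECT * FROM `{table}`"
    for step in steps:
        if step.kind == "filter":
            sql = f"SELECT * FROM ({sql}) WHERE {step.arg}"
        elif step.kind == "aggregate":
            sql = (
                f"SELECT {step.group_by}, {step.arg} "
                f"FROM ({sql}) GROUP BY {step.group_by}"
            )
        else:
            raise ValueError(f"step kind not pushable: {step.kind}")
    return sql


# Illustrative pipeline: two pushable steps followed by one that is not.
steps = [
    Step("filter", "status = 'active'"),
    Step("aggregate", "COUNT(*) AS n", group_by="region"),
    Step("custom_function", "normalize_names"),
]
pushed, remaining = split_pipeline(steps)
sql = to_sql("project.dataset.products", pushed)
```

In this sketch the first two steps become a single SQL statement that could run inside the warehouse, while the third step is left for another execution engine. A pipeline made up only of pushable steps would be translated in full.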
“BigQuery is foundational to our customers’ cloud data warehouse and data lake initiatives,” said Sudhir Hasbe, Sr. Director, Product Management, Smart Analytics at Google Cloud. “Business users are always hungry for more data and expect speed, agility, and empowerment to prepare data for their analytics. With Dataprep leveraging the full power of BigQuery to clean and transform data, users have a perfect companion to get data ready for analysis.”
“Most of the data we prepare with Google Cloud Dataprep by Trifacta is used to organize product hierarchies at Amway. This data is already loaded into our BigQuery product master database. The addition of BigQuery SQL pushdown is of tremendous value to us in terms of job runtime, while maintaining the same user experience to author our wrangling recipes,” said Kevin Schaefer, Senior Data Engineer, Global Data and Analytics at Amway Corporation. “Knowing that Dataprep jobs can either run on Dataflow or BigQuery, we can optimize our consumption based on prioritized use cases.”
“A big requirement for our customers is to not move or cache any data outside of the warehouse. With this new capability, data can remain in the database enabling enterprises to run workloads with the highest security, amazing performance, and at any scale,” said Trifacta CTO and co-founder Sean Kandel.
To learn more about the trends and technology in data engineering, register now for the Wrangle Summit on April 7-9, 2021, the first conference dedicated exclusively to data engineering, hosted by Trifacta and Google Cloud.
The Trifacta Data Engineering Cloud leverages decades of innovative research in human-computer interaction, scalable data management, and machine learning to make the process of preparing data and engineering data products faster and more intuitive. Around the globe, tens of thousands of users at more than 10,000 companies, including leading brands like Deutsche Boerse, Google, Kaiser Permanente, New York Life and PepsiCo, are unlocking the potential of their data with Trifacta’s market-leading data engineering platform. Learn more at trifacta.com.