Trifacta, the global leader in data preparation, today announced a major expansion to its platform to deliver the industry’s first data engineering cloud.
In keeping with its mission to create radical productivity for people who work with data, Trifacta’s expanded capabilities now fully support modern data engineers, who apply software development and DevOps practices to build curated, accessible data products for advanced data insights and analytics. With expectations for these data products rising and cycle times falling, the key to unlocking enterprise value is to enable a broader set of users to collaborate, experiment, and iterate quickly on data science and analytics projects.
“Trifacta is addressing the needs of modern data workers by providing a collaborative, cloud environment where users of all skill levels can come together to improve data quality and streamline data operations as they on-board, assess, and refine raw data,” said Trifacta CEO Adam Wilson. “Accelerating data preparation and democratizing ETL for these users and their cloud data warehousing projects requires an enterprise-grade data engineering platform that is open, intelligent, and self-service. We’ve created the Trifacta Data Engineering Cloud to meet these needs.”
Open: A data engineering environment needs to be flexible to operate seamlessly within end-to-end analytic workflows and integrate fully into modern tool chains. By design, the environment should be extensible enough to allow users to bring their own magic to fill in the gaps and free them from the tyranny of monolithic stacks and vendor lock-in. An open data engineering environment sustains independence and loose coupling while fostering composability of services and the interoperability that enables users to embrace the best-of-breed tools that can help solve complex data problems faster. To support this, Trifacta now offers:
- Multi-cloud Support: The Trifacta data engineering cloud offers native solutions optimized for each platform: Google, AWS, Azure. Multi-cloud support means freedom of choice and freedom from code. Users can change their minds about which platform they prefer or run different workloads in different environments without re-writing any code.
- Flexible Execution: Users can choose between ETL or ELT, or an optimal combination of the two based on cost. Flexible execution also means users have the freedom to generate SQL, Spark, Dataflow/Beam, or Python.
- Universal Connectivity: Users can connect any application with data from more than 180 enterprise data sources—both on-premise and in the cloud—and publish refined data to spreadsheets, BI and reporting tools, as well as to data science notebooks. The connectivity of the Trifacta data engineering cloud is extended with Trifacta’s REST, XML, and JDBC frameworks.
- API-driven: The Trifacta cloud integrates with any and all tool chains. Through SDKs and OpenAPI standards available in a multitude of languages, users can integrate Trifacta into existing workflows or use Trifacta to orchestrate across third-party applications, from source control, ingestion, and replication tools to catalogs and business glossaries.
Intelligent: Modern data engineering tools should be intelligent and learn from the data itself and from user interactions to automate the most complex and time-consuming parts of data cleaning and transformation, improving the user experience and accelerating data-driven innovation. This intelligence should also apply to improving ongoing data operations, providing self-monitoring and, in some cases, automated remediation of issues that would normally disrupt data pipelines. To support this, Trifacta now offers:
- Predictive Transformation: The Trifacta data engineering cloud features a visual “guide and decide” interface that leverages machine learning to make understanding and resolving data transformation challenges intuitive to users of all backgrounds, regardless of their technical acumen. Predictive transformation capabilities include: automatically detecting and applying format to unstructured and semistructured data sets; using examples to infer transformation logic; synthesizing data models from source data; and auto-mapping data to a predefined target.
- Adaptive Data Quality: Trifacta’s active data profiling extends beyond traditional data quality rules. Users can now more easily discover and validate data quality issues. Statistical data profiles are used to identify complex patterns, automatically suggesting possible quality rules such as integrity constraints, formatting patterns, and column dependencies. Users are offered transformations to consider based on classifiers for probabilistic data quality rules and can more easily standardize data with support for sophisticated clustering.
- Smart Data Pipelines: Trifacta empowers users to model data flows while managing relationships across data sets and recipes. They can operationalize and automate data flows through plans that enable parallel and conditional execution, as well as pre- and post-processing. In addition, monitoring point-in-time and historical data quality trends provides the context for proactive alerting (via email, Slack, PagerDuty, and other platforms) on changes in schema and data distributions that may affect the data’s fitness for use downstream.
Self-service: “See for yourself, help yourself” has always been Trifacta’s mantra, but this means different things to different users, depending on their technical know-how. The Trifacta data engineering platform expands the number and variety of people who can participate in the data engineering process. It removes bottlenecks and leverages the collective wisdom of the organization to create new and interesting data products. It provides modern data workers and analytics innovators with the tools they need to collaborate and share knowledge. It caters to less technical subject matter experts as well as to more technical developers who want to operate at differing levels of abstraction from the raw data. To support this, Trifacta is offering:
- Macros, Templates, Sharing: The Trifacta data engineering cloud features macros, shareable data flows, recipes, and templates that reduce repeated tasks, increase consistency of implementation, and enable new projects to be onboarded quickly. These reusable assets increase knowledge sharing within and across organizations, standardizing the approach to common data engineering problems.
- Knowledge Sharing: Best practices, how-tos, discussions, and certifications all come together in the new Trifacta Community. The Community extends Trifacta’s commitment to self-service by empowering users to build on each other’s innovation. More than just a place to learn, this forum is where users can easily contribute their best ideas and assets, and clone and customize what other users have done in a wide range of ETL, data quality, and automation scenarios. Instead of always starting from scratch, users can tap into the Trifacta Community to benefit from each other’s hard work, creative thinking, and problem-solving.
- Usage-Based Pricing: It’s not enough for technology to encourage democratization; its pricing and packaging must also enable broad adoption. Trifacta lets users start on its data engineering cloud for free and only pay as they see value. Convenient online credit card purchasing and “pay-as-you-go” options bring the power of data prep, ETL, and data quality to individuals, teams, and organizations of all sizes.
Enterprise-grade: Analytics innovators increasingly rely on data engineering as mission-critical. Questions of scalability and governance must be considered up front. To support this, Trifacta is offering:
- Unlimited Scalability: The Trifacta data engineering cloud is completely serverless and completely elastic. It handles it all, from individual spreadsheets to petabytes of data, from smart samples to entire data sets.
- Built-in Governance: With full audit trails and lineage, versioning and SDLC support, the Trifacta data engineering cloud tracks and manages change automatically across projects and environments.
- Support and Reliability: With follow-the-sun support, high availability, and built-in redundancy, the Trifacta data engineering cloud ensures full resilience and eliminates DevOps costs. There’s nothing to deploy. An in-product “advisor” chat bot also makes Trifacta experts and expertise available at the moment they’re needed.
- Best-in-Class Security: Authentication, authorization and encryption, combined with VPC support and a host of certifications with leading security standards (SOC2 Type 2, GDPR, etc.), ensure data is fully protected at all times.
To learn more about the trends and technology in data engineering, register now for the Wrangle Summit on April 7-9, 2021, the first conference dedicated exclusively to data engineering, hosted by Trifacta and Google Cloud.
The Trifacta Data Engineering Cloud leverages decades of innovative research in human-computer interaction, scalable data management, and machine learning to make the process of preparing data and engineering data products faster and more intuitive. Around the globe, tens of thousands of users at more than 10,000 companies, including leading brands like Deutsche Boerse, Google, Kaiser Permanente, New York Life and PepsiCo, are unlocking the potential of their data with Trifacta’s market-leading data engineering platform. Learn more at trifacta.com.
“With over a thousand stores and hundreds of thousands of employees, Woolworths Australia requires careful planning and optimization of our facilities to maximize returns. Every step to produce useful data insights, from data collection to advanced analytics, influences the company’s strategy,” says Radha Goli, lead data engineer at Woolworths. “With Trifacta’s new capabilities, such as orchestration, the addition of new connectors, and enterprise operationalization, we are able to deploy data engineering more broadly and guarantee repeatable and trustworthy data outcomes to inform our business.”
“Google Cloud’s goal is to offer an analytics platform that delivers actionable insights in real-time, and we have conviction that to do that, solutions must be open, intelligent, and flexible,” said Debanjan Saha, Vice President, Data & Analytics, Google Cloud. “Trifacta’s Data Engineering Platform is built on those same principles, making Dataprep by Trifacta a perfect companion solution for Google Cloud customers to assess and remediate data quality and enable self-service analytics across their organization.”
“Clean and annotated data is essential to the success of modern machine learning and artificial intelligence. However, achieving clean data takes time and, of course, resources, creating friction for analytics initiatives,” said Ankur Mehrotra, General Manager, AWS AI. “We are excited to have Trifacta as a data preparation partner to help our customers reduce the effort required to clean and prepare their data and spend more time creating intelligence.”
“Clean data is critical to any information-based organization,” said Charlotte Yarkoni, VP C+E Growth and Ecosystems, Microsoft. “But the process of cleaning and preparing data for use is time-consuming and challenging. Trifacta, by leveraging Microsoft Azure big data and advanced analytics services, arms our shared customers with the ability to simplify those processes in order to more efficiently analyze the data and seek out meaningful insights.”
“Infosys helps organizations with their data and analytics challenges, transforming them into data-driven enterprises leveraging its Cobalt ecosystem – a set of services, solutions and platforms for enterprises to accelerate their cloud journey. Because client environments, data, and personnel can vary greatly, it is critical that the cloud data engineering platform underpinning our work be open and intelligent, while also enabling effective governance,” said Sunil Senan, SVP, Data and Analytics, Infosys. “Trifacta brings unique strengths to the Infosys Cobalt ecosystem and helps our clients modernize and transform their data and analytics capabilities.”
“Data preparation is an important part of the continuing trend toward end-user empowerment and self-service business intelligence, approaching the same level of importance given to reporting, dashboards, and data integration. Our research shows increasing investment in data preparation technology which provides a solid foundation for an information democracy initiative, bringing data engineers, data analysts, and business people together as one team,” said Howard Dresner, Founder and Chief Research Officer at Dresner Advisory. “Trifacta continues to show leadership in our annual Data Preparation and Data Pipelines and Integration market studies.”