Data is spread over tens and sometimes hundreds of data silos. Your data workloads are increasing, with new formats, mostly unstructured, across clouds and on-premises systems. There are just too many tools to learn and move between. With all these challenges, AI projects eventually become data projects in disguise.
Google’s Data and AI Cloud lets you interconnect your data at multiple levels.
Interconnect structured and unstructured data - To unlock 360-degree insights into your business, you need to combine and analyze unstructured data, such as images, voice, and documents, with your structured data.
We launched the general availability of BigLake Object Tables to help data users easily access, traverse, process, and query unstructured data using SQL. We also launched support for the Hudi and Delta file formats in BigLake, now generally available. Taking BigLake one step further, we launched the preview of fully managed Iceberg tables in BigLake, so you can use high-throughput streaming ingestion for your data in Cloud Storage, get a fully managed experience with automatic storage optimizations for your lakehouse, and perform DML transactions using BigLake to enable consistent modifications and improved data security, all while retaining full compatibility with the Iceberg reader.
BigLake has seen hyper-growth, with a 27x increase in usage since the beginning of the year.
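To make the object table workflow concrete, here is a minimal sketch using the BigQuery Python client. The project, dataset, connection, and Cloud Storage bucket names are placeholders, and the DDL options and metadata columns shown are assumptions to verify against the BigLake documentation for your setup.

```python
# Minimal sketch: create a BigLake object table over files in Cloud Storage
# and query its metadata with SQL. All resource names below are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumes default credentials

# Assumed DDL: object tables are external tables defined over a Cloud Storage
# prefix through a BigLake connection (names and options are placeholders).
create_object_table = """
CREATE OR REPLACE EXTERNAL TABLE `my-project.analytics.product_images`
WITH CONNECTION `my-project.us.biglake-conn`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://my-bucket/product-images/*']
);
"""
client.query(create_object_table).result()

# Query the unstructured objects' metadata like any other table.
rows = client.query("""
SELECT uri, content_type, size
FROM `my-project.analytics.product_images`
ORDER BY size DESC
LIMIT 10
""").result()

for row in rows:
    print(row.uri, row.content_type, row.size)
```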
Interconnect data across clouds - Many customers manage and analyze their data on Google Cloud, AWS, or Azure with BigQuery Omni, which provides a single pane of glass across clouds. Taking BigQuery Omni one step further, we added support for cross-cloud materialized views and cross-cloud joins. We also extended analytics to on-prem data by bringing Dataproc Spark to Google Distributed Cloud. This allows you to run Spark on sensitive data in your own data centers to support compliance or data sovereignty requirements, and to connect it with your BigQuery data in Google Cloud.
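As a rough illustration of a cross-cloud join, the sketch below joins a table that lives in a BigQuery Omni (AWS) dataset with a table in a Google Cloud dataset from a single query. The project, dataset, and table names are invented for the example; what makes the join cross-cloud is simply where each dataset is located.

```python
# Minimal sketch of a cross-cloud join: orders live in an AWS-region dataset
# managed through BigQuery Omni, customers live in a Google Cloud dataset.
# All project, dataset, and table names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

cross_cloud_join = """
SELECT c.region, COUNT(*) AS orders
FROM `my-project.aws_sales.orders` AS o       -- dataset in an AWS region via Omni
JOIN `my-project.gcp_crm.customers` AS c      -- dataset in a Google Cloud region
  ON o.customer_id = c.customer_id
GROUP BY c.region
ORDER BY orders DESC
"""

for row in client.query(cross_cloud_join).result():
    print(row.region, row.orders)
```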
Interconnect data management and governance - We added intelligent data profiling and data quality capabilities to help you understand the completeness, accuracy, and validity of your data. We also launched extended data management and governance capabilities in Dataplex. You get a single pane of glass for all your data and AI assets — including Vertex AI models and datasets, operational databases, and analytical data on Google Cloud and Omni.
Data sharing - In a given week, thousands of organizations share hundreds of petabytes of data across organizational boundaries using BigQuery. To further support data interconnection, we launched BigQuery data clean rooms so you can share and match datasets across companies and collaborate on analysis with trusted partners, all while respecting user privacy.
Cost optimization - Interconnecting all your data shouldn’t be expensive or unpredictable. So, we introduced BigQuery pricing editions, along with slot autoscaling and a new compressed storage billing model. BigQuery editions provide more choice and flexibility, letting you select the right feature set for different workload requirements. You can mix and match Standard, Enterprise, and Enterprise Plus editions to achieve the preferred price-performance for each workload. BigQuery editions include single- or multi-year commitments at lower prices for predictable workloads, and new autoscaling capabilities that support unpredictable workloads by letting you pay only for the compute capacity you use.
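As a rough sketch of how an editions reservation with autoscaling might be set up, the snippet below issues reservation DDL through the BigQuery Python client. The admin project and reservation names are placeholders, and the specific option keys (edition, baseline slots, autoscaling ceiling) are assumptions to confirm against the current reservation DDL reference.

```python
# Rough sketch: create an Enterprise edition reservation with a small baseline
# and an autoscaling ceiling, so spiky workloads only pay for slots they use.
# The reservation name is hypothetical and the OPTIONS keys are assumptions to
# verify against the BigQuery reservation DDL documentation.
from google.cloud import bigquery

client = bigquery.Client(project="admin-project")

client.query("""
CREATE RESERVATION `admin-project.region-us.etl-workloads`
OPTIONS (
  edition = 'ENTERPRISE',
  slot_capacity = 100,        -- baseline slots, billed while the reservation exists
  autoscale_max_slots = 400   -- extra slots added on demand and billed per use
);
""").result()
```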