Data Enrichment
Data enrichment enhances raw data by adding external or derived information to it before the data is used to train or run a machine learning model. The added information gives your dataset context, accuracy, and depth, which improves the quality and relevance of the model's inputs. Better inputs can lead to more accurate predictions, deeper insights, and the ability to handle complex business problems more effectively.
Enrichment Types Overview:
Gather from Public Web:
What It Does: This option pulls additional data from publicly available web sources to supplement your existing dataset.
Use Case: Automatically fetch weather data based on the date and location of a claim, providing additional context for understanding weather-related damage.
Purpose: Provides external data that may impact your predictions, offering richer context for the model to make more accurate decisions.
Use Generative AI:
What It Does: Uses generative AI to create synthetic or inferred data based on existing inputs.
Use Case: Automatically generate descriptions for damage reports, or fill in missing data points with generated values, improving the model's understanding of the input data.
Purpose: Fills gaps in your data by generating probable values, helping your model learn from a more comprehensive dataset.
Perform Calculations:
What It Does: Automatically calculates or derives new fields from the data you’ve provided, based on predefined formulas or rules.
Use Case: Estimate repair costs for a claim based on the severity of the damage and other factors.
Purpose: Saves time by automating repetitive calculations, ensuring that key metrics are included in your dataset for better decision-making.
Retrieve from Private Datasets:
What It Does: Integrates data from your private or proprietary datasets to enhance the context and relevance of your model’s input.
Use Case: Use historical claim data from your own database to provide additional insight into how similar past claims were processed and resolved.
Purpose: Adds valuable internal data, allowing your model to benefit from your own business intelligence and improve accuracy.
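To make two of these enrichment types concrete, the following is a minimal sketch in plain Python of what "Perform Calculations" and "Retrieve from Private Datasets" amount to conceptually for a single claim record. It is not the platform's implementation or API: the field names (severity, insured_value, damage_type), the severity factors, and the in-memory HISTORICAL_CLAIMS list are all hypothetical stand-ins chosen for illustration.

```python
from statistics import mean

# Hypothetical severity-to-cost factors; real formulas come from your own rules.
SEVERITY_COST_FACTOR = {"low": 0.10, "medium": 0.35, "high": 0.80}

# Hypothetical stand-in for a private dataset of past claims.
HISTORICAL_CLAIMS = [
    {"damage_type": "hail", "payout": 4200.0},
    {"damage_type": "hail", "payout": 3800.0},
    {"damage_type": "flood", "payout": 15000.0},
]


def estimate_repair_cost(claim):
    """Calculation enrichment: derive a new field from existing fields."""
    factor = SEVERITY_COST_FACTOR[claim["severity"]]
    return round(claim["insured_value"] * factor, 2)


def average_past_payout(claim, history):
    """Private dataset enrichment: look up similar past claims."""
    payouts = [h["payout"] for h in history
               if h["damage_type"] == claim["damage_type"]]
    # None signals that no comparable past claim was found.
    return mean(payouts) if payouts else None


def enrich_claim(claim, history=HISTORICAL_CLAIMS):
    """Return a copy of the claim with the enriched fields added."""
    enriched = dict(claim)  # leave the raw input record untouched
    enriched["estimated_repair_cost"] = estimate_repair_cost(claim)
    enriched["avg_past_payout"] = average_past_payout(claim, history)
    return enriched


claim = {
    "claim_id": "C-1",
    "damage_type": "hail",
    "severity": "medium",
    "insured_value": 20000.0,
}
enriched = enrich_claim(claim)
print(enriched)
```

The sketch returns a copy rather than modifying the raw record, so the original input stays available for comparison with the enriched version.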
Application Flow for Configuring Input Data Enrichment:
Navigate to the Data Enrichment Tab:
In your dashboard, click on the "Data Enrichment" tab to access the enrichment configuration options.
Select Schema Fields for Enrichment:
Choose the fields from your input schema that you want to enrich. You can map one or more fields to the enrichment options below.
Enable Enrichment Options:
For each field, you can enable different types of enrichment depending on your needs:
Gather from Public Web:
Check the box for "Public Web Enrichment" to pull additional data from publicly available sources.
Use Case Example: Fetch weather data for the date of a property damage claim to correlate it with the damage type.
Use Generative AI:
Select "Generative AI Enrichment" to allow the system to generate synthetic data to augment the dataset.
Use Case Example: Generate possible descriptions of damage based on claim descriptions to improve model training.
Perform Calculations:
Choose "Calculation Enrichment" to derive new fields or perform automatic calculations from existing data.
Use Case Example: Automatically calculate the estimated repair cost based on damage severity levels.
Retrieve from Private Datasets:
Enable "Private Dataset Enrichment" to integrate data from your proprietary datasets.
Use Case Example: Use historical claim data stored in your private datasets to enhance predictive accuracy.
Review and Save:
After configuring the enrichment options for your schema fields, review the selections and click "Save" to apply the enrichment settings.
Once you save, your data is automatically enriched based on the schema fields and enrichment types you selected. The resulting dataset carries more context into model training and processing, which helps your machine learning models produce more accurate and actionable results.
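The configuration flow above can be pictured as a mapping from schema fields to the enrichment options enabled for each one. The sketch below is an assumption about how such settings might be represented, not the platform's actual storage format or API: the Enrichment enum, the build_enrichment_config helper, and the example field names are hypothetical, and only the option labels are taken from the steps above.

```python
from enum import Enum


class Enrichment(Enum):
    # Labels match the checkboxes named in the configuration steps.
    PUBLIC_WEB = "Public Web Enrichment"
    GENERATIVE_AI = "Generative AI Enrichment"
    CALCULATION = "Calculation Enrichment"
    PRIVATE_DATASET = "Private Dataset Enrichment"


def build_enrichment_config(schema_fields, selections):
    """Validate selections against the input schema and return the settings."""
    unknown = sorted(set(selections) - set(schema_fields))
    if unknown:
        raise ValueError(f"Fields not in input schema: {unknown}")
    config = {}
    for field, options in selections.items():
        if options:  # skip fields with no enrichment enabled
            # De-duplicate and sort so the saved settings are deterministic.
            config[field] = sorted(option.value for option in set(options))
    return config


# Hypothetical input schema and per-field selections.
schema_fields = ["claim_date", "location", "damage_severity", "claim_description"]
selections = {
    "claim_date": [Enrichment.PUBLIC_WEB],
    "damage_severity": [Enrichment.CALCULATION, Enrichment.PRIVATE_DATASET],
    "claim_description": [],
}
config = build_enrichment_config(schema_fields, selections)
print(config)
```

Checking the selected fields against the schema before saving mirrors the "Review and Save" step: a selection on a field that does not exist in the input schema is rejected instead of being silently ignored.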