Einstein Text Clustering

Bringing Data-Driven Analysis and Insights to Salesforce with Einstein Discovery Text Clustering

Einstein Discovery brings predictive analytics into Salesforce, letting you build machine learning models without writing code.

Text Clustering: Unlock Hidden Insights from Unstructured Data

Text clustering simplifies unstructured data by extracting its top keywords, allowing you to quickly uncover hidden insights and make informed decisions.

Required Editions

Salesforce Classic and Lightning Experience: Supported in both interfaces.
CRM Analytics: Available at an additional cost in Enterprise, Performance, and Unlimited Editions. Also available in Developer Edition.

Note: Einstein Discovery stories are now referred to as models. While the updated terminology is rolling out, you may still encounter the old name in some areas until the transition is complete.

How to Use Text Clustering

Access Model Settings:
- Select a variable containing unstructured data, such as customer comments or feedback.
Apply Text Clustering:
- From the Transform dropdown menu, choose Text Clustering.
![Text Clustering Option]
Train the Model:
- Click Train Model to initiate the process. Once completed, return to Model Settings to explore the results.
Review Keyword Clusters:
- Examine the identified keyword clusters displayed in a table and visual chart for better understanding.
![Text Clustering Results]
Search for Specific Keywords:
- Use the search bar to locate specific keyword instances within your data.

Text clustering enables businesses to organize unstructured data efficiently, providing clarity and actionable insights that can drive smarter decision-making.

Enhance Predictive Models with Text Clustering in Einstein Discovery

Supervised machine learning models often rely on tabular datasets consisting of numerical, categorical, and temporal variables. However, additional value can be unlocked by incorporating insights from unstructured text data. Examples of unstructured text include:

Salesforce records: Notes from opportunities, service cases, activities, etc.
Customer feedback: Surveys, product reviews, or complaints.
Meeting notes and product descriptions.
Digital communication: Emails, social media posts, live chat logs, and chatbot data.

It is estimated that 80% of business data is unstructured. While this figure is just an approximation, it highlights the wealth of untapped information available in text data. Harnessing this data with machine learning opens doors to enhanced business insights and improved outcomes.

Introducing Text Clustering in Einstein Discovery

Previously, unstructured text data required external preprocessing before it could be used in Einstein Discovery predictive models. With the Summer ’22 Release, Salesforce introduced a Text Clustering feature, enabling Einstein Discovery to seamlessly analyze and incorporate unstructured text.

This functionality can improve model accuracy, and it also surfaces new insights that support business decisions and streamline workflows.

How Text Clustering Works

Algorithms Behind Text Clustering

Two core algorithms power text clustering in Einstein Discovery:

TF-IDF (Term Frequency-Inverse Document Frequency):
- Term Frequency (TF): Measures how often a word appears in a document relative to the total number of words in that document.
- Inverse Document Frequency (IDF): Measures how distinctive a word is across the entire dataset by down-weighting terms that appear in many documents.
- TF-IDF assigns scores between 0 and 1 to each word, prioritizing terms that are frequent within specific documents but rare across the dataset.
K-Means Clustering:
- Once TF-IDF scores are calculated, K-Means groups the top 75 most significant terms into 10 clusters (K=10).
- From these clusters, the top 3 terms per cluster are selected for use in the predictive model, helping identify the text most relevant to the predicted outcomes.
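
The TF-IDF step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not Salesforce's implementation: the `tokenize` and `tf_idf` helpers, the sample reviews, the logarithmic IDF formula, and the unit-length normalization (which is what keeps every score between 0 and 1 here) are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters; production tokenizers do far more.
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document, scaled to unit length."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    all_scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # TF = share of the document's words; IDF = log(N / document frequency).
        raw = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Assumed normalization: divide by the vector length so scores land in [0, 1].
        norm = math.sqrt(sum(value ** 2 for value in raw.values()))
        all_scores.append(
            {term: (value / norm if norm else 0.0) for term, value in raw.items()}
        )
    return all_scores


# Hypothetical sample reviews, invented for illustration only.
reviews = [
    "Rich oak and vanilla with a long finish",
    "Bright acidity and fruity notes",
    "Fruity and light with crisp acidity",
]
scores = tf_idf(reviews)
for term in ("oak", "with", "and"):
    print(term, round(scores[0][term], 3))
```

In this sketch a word that occurs in every review (such as "and") scores zero, while a word unique to one review (such as "oak") scores higher than a word shared by two reviews (such as "with").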

Salesforce handles the technical complexities of these algorithms, allowing users to focus on insights without delving into the math.

Using Text Clustering in Einstein Discovery: A Step-by-Step Guide

Step 1: Create a Dataset

Begin with a dataset that includes a column of unstructured text. This text can come from Salesforce data, external sources, or combined datasets.

For example, imagine you are a wine distributor analyzing a dataset of wine reviews to predict the average price of wine sold.
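
As a rough picture of what such a dataset might look like, the sketch below builds a small CSV that mixes structured columns with one free-text column. The column names ("Variety", "Price", "Customer Review") and every row are hypothetical, invented purely for illustration; a real dataset would come from your own Salesforce or external data.

```python
import csv
import io

# Hypothetical rows: two structured columns plus one unstructured text column.
rows = [
    {"Variety": "Chardonnay", "Price": 18.0,
     "Customer Review": "Crisp and fruity, with bright acidity."},
    {"Variety": "Cabernet Sauvignon", "Price": 42.0,
     "Customer Review": "Rich oak and vanilla with a long finish."},
    {"Variety": "Pinot Noir", "Price": 27.5,
     "Customer Review": "Light body, a little smoky, very smooth."},
]

# Write to an in-memory buffer; swap in open("reviews.csv", "w", newline="")
# to produce a file that could be uploaded as a dataset.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Variety", "Price", "Customer Review"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The free-text column is the one that text clustering would later be applied to; the other columns behave like ordinary categorical and numerical variables.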

Step 2: Create an Einstein Discovery Story

Navigate to Einstein Discovery and create a new story.
Choose Insights & Predictions and select Manual Configuration to include specific columns in the model.
The unstructured text column (e.g., “Customer Review”) will initially appear unavailable; it becomes usable once a transformation is applied in Step 3.

Step 3: Apply the Text Clustering Transformation

In the Story Settings, click the text column (e.g., “Customer Review”) to reveal the Transform dropdown menu.
Select Text Clustering.
Proceed to create the story. Einstein Discovery will analyze the text and add the derived clusters to your model.

Step 4: Explore Story Insights

After the story is complete:

Review the Insights Page, where text clusters and their associated keywords are displayed as variables.
Visualize the top 10 clusters, each with 3 key terms, to understand how unstructured text impacts your predicted outcome.
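
To give a feel for how terms end up grouped into clusters with a few key terms each, here is a minimal K-Means sketch in plain Python. It is an illustration under stated assumptions, not the product's implementation: the term vectors are invented scores (one value per document), the example uses K=2 and eight terms rather than K=10 and 75 terms, the centroids are seeded with the first K points to keep the run deterministic, and ranking terms by their summed score is an assumed rule for picking the top 3.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=50):
    # Simplification: seed centroids with the first k points (deterministic).
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # Converged: no point changed cluster.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def top_terms_per_cluster(term_vectors, k, top_n=3):
    terms = list(term_vectors)
    assignments = kmeans([term_vectors[t] for t in terms], k)
    clusters = {c: [] for c in range(k)}
    for term, cluster in zip(terms, assignments):
        clusters[cluster].append(term)
    # Assumed ranking rule: highest total score across documents first.
    return {
        c: sorted(members, key=lambda t: sum(term_vectors[t]), reverse=True)[:top_n]
        for c, members in clusters.items()
    }


# Hypothetical per-document scores for each term, invented for illustration.
term_vectors = {
    "oak": [0.9, 0.0, 0.1],
    "vanilla": [0.8, 0.1, 0.0],
    "smoky": [0.7, 0.0, 0.0],
    "toasty": [0.5, 0.0, 0.1],
    "fruity": [0.0, 0.9, 0.8],
    "acidity": [0.1, 0.8, 0.9],
    "crisp": [0.0, 0.7, 0.7],
    "citrus": [0.1, 0.6, 0.6],
}
top_terms = top_terms_per_cluster(term_vectors, k=2)
print(top_terms)
```

With this toy input, terms that tend to appear in the same reviews fall into the same cluster, and each cluster is summarized by its three strongest terms, which mirrors how the Insights Page presents clusters as variables.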

Step 5: Leverage Predictions in Your Workflow

Text clusters become part of the predictive model:

Model Predictions: Text clusters influencing predictions are highlighted in the Prediction Examination section.
Einstein Discovery Component: Deploy predictions directly into Salesforce using the Einstein Discovery Lightning Web Component. This ensures actionable insights for each record.

Example Walkthrough: Predicting Wine Prices

Scenario: A wine distributor uses Einstein Discovery to analyze customer wine reviews.

Dataset: Contains structured data (e.g., wine variety, price) and unstructured text (e.g., reviews).
Goal: Predict the average price of wine sold and use customer reviews to uncover factors driving pricing decisions.

Through Text Clustering:

Customer Review text is transformed into clusters, identifying key terms like “oak,” “fruity,” or “acidity.”
These insights help the distributor target marketing efforts or adjust pricing strategies based on the language customers use in their reviews.

Benefits of Text Clustering in Einstein Discovery

Improved Model Accuracy: Adds context and nuance to predictions using unstructured data.
Enhanced Insights: Identifies patterns in text that influence outcomes, offering deeper understanding.
Actionable Predictions: Enables tailored interventions directly within Salesforce workflows.

By leveraging Text Clustering, organizations can transform unstructured data into meaningful insights, driving smarter decisions and greater business impact.

To utilize Einstein Discovery Text Clustering, your organization needs the appropriate license, with user accounts assigned relevant permissions.

Salesforce AI provides a reliable and scalable AI experience embedded in the foundation of its Einstein Platform.

Reveal untapped insights from your unstructured data with text clustering, and fine-tune prediction accuracy with trending data.

Ready for an Einstein Discovery Text Clustering Implementation?

Explore the possibilities of Einstein Discovery Analysis or other Salesforce solutions based on your company’s goals.

Although Salesforce offers numerous benefits, companies may face challenges during integration, such as aligning it with existing systems and training employees to use it effectively.

Tectonic is ready to partner with you, offering support at every stage of your Salesforce journey. Reach out online to request more information or schedule a call – we look forward to discussing your needs!