Techniques

- Semantic similarity measures how close the "meaning" of two pieces of text is. Lexical similarity measures how much the word sets used in the texts overlap
- To accurately capture the similarity between course descriptions, semantic similarity is the superior method. A simple example illustrates why.
- Consider the following two sentences:
- "The tourism industry is collapsing"
- "Travel industry fears Covid-19 crisis will cause more holiday companies to collapse"
- The two sentences share almost no words, so their lexical similarity is low, yet they describe the same situation, so their semantic similarity is high
- To compute semantic similarity, course descriptions are first converted into vectors. Once in vector form, their similarity can be calculated using cosine similarity
- Cosine similarity measures the angle between two vectors and determines to what extent they point in roughly the same direction. It is widely used to measure document similarity in text analysis
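As a minimal sketch of the idea above, cosine similarity can be computed directly with NumPy (the function name here is just illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors:
    1.0 means same direction, 0.0 means orthogonal."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Vectors pointing in the same direction score 1.0;
# orthogonal vectors score 0.0.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # 1.0
print(cosine_similarity([1, 0], [0, 1]))        # 0.0
```

Note that cosine similarity ignores vector magnitude, which is why scaling a vector (e.g. repeating a document twice) does not change its similarity score.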
- But how do we convert the sentences into vectors? That is where BERT comes into the picture.
- BERT stands for Bidirectional Encoder Representations from Transformers. It is a deep neural network model that can encode a sentence, along with its meaning, into a vector.