Cosine similarity between three text files

Written by News One November 11, 2024

I have three .txt files that contain text (they are novels). I need to compute the pairwise cosine similarity between the three texts and then produce a multi-dimensional graph that places the three texts in relation to each other based on their cosine similarity scores. The final output should be the graph. This is what my text looks like. I got as far as getting the tokens, but not as far as the cosine scores. Once I run

from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()

dtm = vectorizer.fit_transform(tokens)
vocab = vectorizer.get_feature_names_out()
matrix = dtm.toarray()

my kernel restarts and I have to run everything again. I would appreciate a thorough explanation from the top, because I feel that maybe I didn’t start out right.
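One possible cause of the restart, assuming `tokens` is a flat list of individual word tokens (the question does not show it): `CountVectorizer` treats every element of its input as a separate document, so the matrix gets one row per token, and `toarray()` then tries to allocate that whole matrix densely. A minimal sketch of one way to approach the task is below. It passes one string per novel, keeps the matrix sparse, and places the texts in 2D with classical multidimensional scaling on the distance `1 - similarity`. The file names and the helper names `similarity_matrix`, `classical_mds` and `plot_layout` are hypothetical, and classical MDS is only one of several possible layout choices, not something the question prescribes.

```python
from pathlib import Path

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def similarity_matrix(texts):
    """Pairwise cosine similarity between whole documents (hypothetical helper)."""
    # One string per novel, so the matrix has len(texts) rows, not one per token.
    vectorizer = CountVectorizer()
    dtm = vectorizer.fit_transform(texts)  # sparse; no .toarray() needed
    return cosine_similarity(dtm)          # dense (n_docs, n_docs) array


def classical_mds(dist, n_components=2):
    """Embed points so that their Euclidean distances approximate `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Cosine distance is not guaranteed Euclidean, so clip negative eigenvalues.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def plot_layout(coords, labels, out_path="similarity.png"):
    """Scatter the documents and label each point (assumes matplotlib is installed)."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(coords[:, 0], coords[:, 1])
    for (x, y), label in zip(coords, labels):
        ax.annotate(label, (x, y))
    fig.savefig(out_path)


if __name__ == "__main__":
    # Hypothetical file names; replace with the real paths to the three novels.
    paths = ["novel1.txt", "novel2.txt", "novel3.txt"]
    texts = [Path(p).read_text(encoding="utf-8") for p in paths]
    sim = similarity_matrix(texts)
    dist = np.clip(1.0 - sim, 0.0, None)  # guard against tiny negative values
    plot_layout(classical_mds(dist), paths)
```

With only three documents the similarity matrix is 3 by 3, so memory should not be an issue at this stage. Swapping `CountVectorizer` for `TfidfVectorizer` would down-weight words common to all three novels, which may or may not be wanted here.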
