Use a set, not a list

Picking the right stop words is hard.

Once you have them, make using them fast.

Containment tests (in and not in) take roughly constant time on a set, however large it is. On a list they scan element by element, so they get slower as the list grows.
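To get a feel for the difference, here is a rough sketch using the standard library's timeit module. The 10,000 made-up words and the token are placeholders for illustration, not a real stop word list, and the exact timings will vary from machine to machine.

```python
import timeit

# Hypothetical stand-in for a stop word list: 10,000 made-up strings.
words_list = [f"word{i}" for i in range(10_000)]
words_set = set(words_list)

# A token that is not a stop word is the worst case for the list:
# every element is compared before `in` can answer False.
token = "not-a-stop-word"

list_seconds = timeit.timeit(lambda: token in words_list, number=1_000)
set_seconds = timeit.timeit(lambda: token in words_set, number=1_000)

print(f"list: {list_seconds:.4f}s  set: {set_seconds:.6f}s")
```

Most tokens in ordinary text are not stop words, so this worst case is also the common case.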

No matter how you get your stop words, always put them in a set.
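As a minimal sketch of that advice, the example below converts a list to a set once and then filters tokens against it. The five-word list and the remove_stop_words helper are hypothetical, chosen only to keep the example self-contained.

```python
# Hypothetical stop word list, standing in for whatever source you use.
stop_word_list = ["the", "a", "of", "and", "is"]
stopwords = set(stop_word_list)  # convert once, up front


def remove_stop_words(tokens, stopwords):
    """Return the tokens that are not stop words, preserving order."""
    return [t for t in tokens if t.lower() not in stopwords]


tokens = "The speed of a set is the point".split()
print(remove_stop_words(tokens, stopwords))  # ['speed', 'set', 'point']
```

Do the conversion outside the loop. Calling set() on every test would cost more than the list scan it replaces.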

Libraries that don’t give you a set…

NLTK

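import nltk.corpus
# The corpus must be downloaded once first: nltk.download('stopwords')
# words() returns a list, so wrap it in set().
stopwords = set(nltk.corpus.stopwords.words('english'))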

stop-words

import stop_words
stopwords = set(stop_words.get_stop_words('english'))

Libraries that give you a set, making your code faster and cleaner…

spaCy

import spacy.lang.en
type(spacy.lang.en.STOP_WORDS) is set

scikit-learn

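# Recent scikit-learn releases expose the constant from the text module;
# the old sklearn.feature_extraction.stop_words module is no longer available.
import sklearn.feature_extraction.text
type(sklearn.feature_extraction.text.ENGLISH_STOP_WORDS) is frozenset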