Hello,
I was wondering how I can search the whole corpus for a specific word and then display the results. How should I specify that I want to search the whole corpus, given that it currently consists of many XML files? What language should I use?
Hi akurteva1516, welcome to the forum.
You posted this in the JavaScript category, but considering the body “is a 100 million word collection”, I can reasonably guarantee that JavaScript is not what you want to use. With that much data, even a powerful server-side language would still be resource-intensive and slow.
Even going file by file would not be a pleasant experience:
size: 4049 files: c. 515 MB
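If you do want to try the file-by-file route anyway, here is a rough Python sketch of what it could look like. It is only an illustration under a few assumptions: all the XML files sit somewhere under one directory, each file is small enough to parse in memory on its own, and matching whole words case-insensitively is what you are after. The function name `search_corpus` and the directory path in the usage note are made up for the example.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def search_corpus(corpus_dir, word):
    """Yield (file path, hit count) for every XML file that contains `word`."""
    # Whole-word, case-insensitive match; re.escape guards against regex metacharacters.
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    # rglob walks the directory tree, so "the whole corpus" is every *.xml below corpus_dir.
    for path in sorted(Path(corpus_dir).rglob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        # Ignore the markup and search only the text content of the file.
        text = " ".join(root.itertext())
        hits = len(pattern.findall(text))
        if hits:
            yield path, hits
```

Something like `for path, hits in search_corpus("path/to/corpus", "whale"): print(path, hits)` would then list each matching file with its count. Note that this re-reads and re-parses every file on every search, which is exactly why it gets slow at this scale.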
Is there any particular reason why you don’t want to avail yourself of one of the services that already has everything in a database and provides search functionality?