multithreading - Writing a search engine


The title may be a bit misleading, but I could not think of a better one. I'm writing a simple search engine that searches a number of sites within a specific domain. To be concrete: I'm writing a search engine for tracks and related material, so I will search sites that carry this kind of content. The problem is speed: I have to pass the search query to 5-7 sites, get the results, and then display them in ranked order using my own algorithms. The only approach I could think of is multithreading, but that is easier said than done, so I have some questions.

  1. What would be the best solution to this problem? Should I just use threads or processes in this application to gain some speed?

  2. Is there another solution, or am I approaching this the wrong way?

Thanks,

William van Dorn

Unless you are trying to learn multithreading, avoid writing this infrastructure yourself. Synchronizing many tasks that take different amounts of time, handling failures, and so on quickly becomes a mess.

For largely independent parallel tasks (such as querying many sites and combining the results), you want to look at existing infrastructure.
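As a rough illustration of leaning on existing infrastructure, here is a minimal sketch using Python's standard `concurrent.futures` thread pool. The site functions (`search_site_a`, `search_site_b`, `search_broken_site`) and the `search_all` helper are hypothetical stand-ins, not real site clients; a real version would make HTTP requests instead of sleeping.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout

# Hypothetical site clients; each would normally perform an HTTP request.
def search_site_a(query):
    time.sleep(0.05)  # simulate a slow site
    return [f"{query} (site A, hit 1)", f"{query} (site A, hit 2)"]

def search_site_b(query):
    time.sleep(0.01)  # simulate a fast site
    return [f"{query} (site B, hit 1)"]

def search_broken_site(query):
    raise ConnectionError("site unreachable")  # simulate a failing site

SITES = {
    "site_a": search_site_a,
    "site_b": search_site_b,
    "broken": search_broken_site,
}

def search_all(query, sites=SITES, timeout=5.0):
    """Query every site in parallel; return (results_by_site, errors_by_site)."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        # Submit one task per site and remember which future belongs to which site.
        futures = {pool.submit(fn, query): name for name, fn in sites.items()}
        try:
            # Handle each site as soon as it answers, regardless of submission order.
            for future in as_completed(futures, timeout=timeout):
                name = futures[future]
                try:
                    results[name] = future.result()
                except Exception as exc:
                    # One failing site must not break the whole search.
                    errors[name] = exc
        except FuturesTimeout:
            # Record sites that did not answer in time. Leaving the `with` block
            # still waits for running tasks, so real clients should also set
            # their own request timeouts.
            for future, name in futures.items():
                if name not in results and name not in errors:
                    future.cancel()
                    errors[name] = TimeoutError("no answer within timeout")
    return results, errors

if __name__ == "__main__":
    found, failed = search_all("blue")
    print(found)
    print(failed)
```

The pool takes care of starting the workers, collecting results as they arrive, and surfacing exceptions per task, which is most of the synchronization work that is tedious to write by hand.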

A map/reduce framework (like Hadoop for Java) can handle much of this for you, so that you can focus on the logic of your application.
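To show the shape of the map/reduce idea without a framework, here is a small sketch in plain Python; it is not Hadoop and does not use its API. The data in `FAKE_SITES`, the function names, and the ranking rule (summing the scores of duplicate titles) are all assumptions made for illustration; your own ranking algorithm would replace the reduce step.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-site results: (title, relevance score) pairs.
FAKE_SITES = {
    "site_a": [("Song One", 0.9), ("Song Two", 0.4)],
    "site_b": [("Song Two", 0.8), ("Song Three", 0.5)],
}

def map_site(site_name):
    """Map step: turn one site into a list of (title, score) pairs."""
    # A real version would query the site here instead of reading a dict.
    return FAKE_SITES[site_name]

def reduce_results(per_site_results):
    """Reduce step: merge duplicate titles by summing scores, then rank."""
    totals = defaultdict(float)
    for pairs in per_site_results:
        for title, score in pairs:
            totals[title] += score
    # Highest combined score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

def search(site_names):
    """Run the map step for all sites in parallel, then reduce once."""
    with ThreadPoolExecutor() as pool:
        per_site = list(pool.map(map_site, site_names))
    return reduce_results(per_site)

if __name__ == "__main__":
    for title, score in search(["site_a", "site_b"]):
        print(f"{score:.2f}  {title}")
```

Keeping the per-site work (map) separate from the merging and ranking (reduce) means the application logic lives in two small functions, while the parallel execution is delegated to whatever runs the map step.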

