
Search Engine

One of my longest-running projects, and certainly one of the most expensive and time-consuming, has been the search engine. I began this project in the late 1990s, and during the dot-com era it was transformed into a corporation that eventually went public before being sold. Unfortunately, I received very little from the run-up and sale of the company, mostly due to the economic factors of the time and the business practices of the companies I worked with. It was a hard lesson learned, but it allowed me to take a break and refocus on the search engine itself and the technology surrounding it.

In 2002 and 2003, I made great strides towards building a brand-new search engine, crawler, software suite and hardware from scratch, without even glancing at the old code. The search engine environment had changed completely since 1998, and success meant adapting to it. New search engines (primarily Google, but also others including Teoma, WiseNut and GigaBlast) had upped the ante considerably, and competing in this new world required a large amount of planning and insight. I divided the project into several phases spanning several years to accommodate my available time and budget. In late 2003, I completed the first revision of the new engine, and in early 2004 I began work on the new crawler. Some issues surfaced in this department, and I ended up spending nearly six months arranging internet connectivity to accommodate the new crawler and its insatiable bandwidth needs.

In late 2004, I resumed the task of programming the crawler and moving the search engine into Phase II. The previous prototypes of the engine and crawler had allowed me to identify significant areas where improvement was needed. At the time, I had been stretching myself thin with too many projects, so I decided to adopt a new time-management strategy and focus on one project at a time. In August of 2004 I decided to focus on the WeatherCity project (www.weathercity.com), a global weather forecasting system and website, which was completed in January 2005. In February, I resumed the Search Engine project, beginning with the new crawler. I had assembled a new crawl platform at considerable cost, and with bandwidth no longer an issue, testing and development went ahead at a much quicker pace than ever before.

That brings us to the present; development is going quite well. Substantial work during 2005 led to the development of my latest crawler, Vortex. During a crawl over the 2005 Christmas holidays (a logical choice, since total internet traffic is generally slightly lower during that period), I succeeded in downloading about 100 million documents, a set which was then pared down considerably by a procedure to remove junk webpages. I am expecting to deploy the latest engine at some point this year. My journal gives day-to-day accounts of my progress, among other things, so check it out frequently. If you have any questions about the engine, I'm always glad to answer them. Visit the contact page to send me your questions or comments.






Constructed entirely by hand using only TextPad and PhotoShop
Modified Tuesday June 13, 2006 - 19:49 UTC

(C) Copyright 2000-2010 Marty Anstey ~~ I didn't rip you off, so don't rip me off.