How can I search the HTML of my website in ASP.NET?

Problem description:

How can I search the HTML or text of my website's pages, like a Google search?
I tried searchDotNet.dll. It does what I need, but it does not work on my website.
Please help me!

The obvious way is to use Google. Beyond that, the rendered HTML does not exist in any form that your C# code can look at, so you cannot search it directly. Of course, any text you do have can be searched any way you want, depending on where it comes from. Putting it in a database is a good first step.
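To illustrate the "put it in a database" step, here is a minimal sketch in Python using the standard sqlite3 module. The table name, the column names and the two sample pages are assumptions made for illustration only; the same idea carries over to C# with any SQL database.

```python
import sqlite3

def build_page_store(pages):
    """Create an in-memory SQLite table holding one row per page (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, body TEXT NOT NULL)")
    conn.executemany("INSERT INTO pages (url, body) VALUES (?, ?)", pages.items())
    return conn

def search_pages(conn, term):
    """Return the URLs whose text contains the term (LIKE ignores ASCII case)."""
    # Escape LIKE wildcards so that user input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT url FROM pages WHERE body LIKE ? ESCAPE '\\' ORDER BY url",
        (f"%{escaped}%",),
    )
    return [url for (url,) in rows]

# Sample data, assumed for illustration.
conn = build_page_store({
    "page1.asp": "This is a simple text for testing.",
    "page2.asp": "This test page is simple.",
})
```

Calling search_pages(conn, "testing") returns only page1.asp, because this is a plain substring match. It is fine for small sites, but it does no ranking and no word normalization.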


I think I understand what you want, and I think you are currently looking in the wrong direction. I guess your current idea is to search your web pages from the point of view of the server that hosts the website. For security reasons and some practical reasons this is very hard to do: you would, for example, be indexing your database and would also have to figure out which page each piece of data belongs to. That is very hard to maintain because you need a lot of structural information about the pages.

How to do it then? Search engines crawl your web pages and go from there. They interpret the HTML and look for links to follow. The terms found on each page are then linked to that page by an indexing function, which can also connect related terms to each other. For example, say you have two pages with some simple text:
page1.asp
This is a simple text for testing.
page2.asp
This test page is simple.
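The "interpret the HTML and look for links" step could look roughly like the sketch below, written in Python with the standard html.parser module. The class name, the function name and the example.test URL are hypothetical; a real crawler would also fetch the pages over HTTP and keep track of which URLs it has already visited.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageParser(HTMLParser):
    """Collects the visible text and the outgoing links of one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the URL of the current page.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())

def parse_page(base_url, html):
    """Return (visible_text, absolute_links) for one page."""
    parser = PageParser(base_url)
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links

# Sample page, assumed for illustration.
sample_html = (
    "<html><head><style>p { color: red; }</style></head><body>"
    "<p>This is a simple text for testing.</p>"
    '<a href="page2.asp">next</a></body></html>'
)
text, links = parse_page("http://example.test/page1.asp", sample_html)
```

The returned text goes to the indexer, and the returned links go to the queue of pages still to be crawled.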

Assume that the crawler can find both pages. The search indexer would take a word like "simple" and connect it to page1.asp, and later do the same for page2.asp. A somewhat more complex link is created for the term "testing" on page1.asp. The search indexer must be smart enough to link "test" on page2.asp to "testing" on page1.asp, and also "testing" back to "test" on page2.asp. This creates a link both ways and makes the search engine more useful, because you do not have to specify terms precisely to find what you want.
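One way to get this two-way link between "test" and "testing" is to reduce every word to a common stem before storing it in the index. The sketch below does that in Python with a deliberately naive suffix stripper. The suffix list and the function names are assumptions for illustration; a real indexer would use a proper stemming algorithm.

```python
import re
from collections import defaultdict

# Naive suffix list, assumed for illustration only.
SUFFIXES = ("ing", "ed", "s")

def stem(word):
    """Strip one common suffix so that 'testing' and 'test' share a stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_index(pages):
    """Map each stem to the set of pages that contain a word with that stem."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[stem(word)].add(url)
    return index

def search(index, query):
    """Return the pages matching a single query word, in sorted order."""
    return sorted(index.get(stem(query), set()))

index = build_index({
    "page1.asp": "This is a simple text for testing.",
    "page2.asp": "This test page is simple.",
})
```

With this index, searching for either "test" or "testing" finds both pages, while "text" still finds only page1.asp.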

Furthermore, it is up to you to define the search indexer function you use and how smart it is. For example, you could add information to the link between "testing" and page2.asp indicating that this term is not found literally on that page. More information can be found all over the internet, of course.
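As a sketch of that extra information, the index can remember which literal word forms occur on each page, so that a search result can say whether the query was found literally or only through a related form. The names below are hypothetical, and the stemming is again a naive assumption.

```python
import re

def stem_word(word):
    """Naive stemmer (an assumption): strip one of a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_annotated_index(pages):
    """Map stem -> {url -> set of literal word forms found on that page}."""
    index = {}
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(stem_word(word), {}).setdefault(url, set()).add(word)
    return index

def search_annotated(index, query):
    """Return (url, is_literal) pairs; is_literal is False for stem-only matches."""
    query = query.lower()
    hits = index.get(stem_word(query), {})
    return [(url, query in forms) for url, forms in sorted(hits.items())]

annotated = build_annotated_index({
    "page1.asp": "This is a simple text for testing.",
    "page2.asp": "This test page is simple.",
})
```

A search for "testing" then reports page1.asp as a literal match and page2.asp as a related match, which the result page could use for ranking or for display.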

The most important thing is to keep the search engine (the search indexer) separate from your website and let it operate on its own. This also makes it more flexible and easier to reuse. Also, don't forget that indexing the site is a continuous process: the indexer must check for changes on a regular basis and update itself to stay up to date.
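A simple way to implement the regular check for changes is to store a hash of each page's text and re-index only the pages whose hash differs on the next run. The sketch below assumes the crawler hands over the current text of every page as a dictionary; the function names are hypothetical.

```python
import hashlib

def content_hash(text):
    """Fingerprint of a page's text, used to detect changes between runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def pages_to_reindex(known_hashes, current_pages):
    """Compare the current pages with the stored hashes.

    Returns (changed_or_new_urls, removed_urls) and updates known_hashes in place.
    """
    changed = []
    for url, text in current_pages.items():
        digest = content_hash(text)
        if known_hashes.get(url) != digest:
            changed.append(url)
            known_hashes[url] = digest
    removed = [url for url in known_hashes if url not in current_pages]
    for url in removed:
        del known_hashes[url]
    return sorted(changed), sorted(removed)

# Sample run, assumed for illustration: the first pass sees two new pages.
known = {}
site = {
    "page1.asp": "This is a simple text for testing.",
    "page2.asp": "This test page is simple.",
}
first_run = pages_to_reindex(known, site)
second_run = pages_to_reindex(known, site)
```

The first run reports both pages as new and the second run reports nothing, so a scheduled job can call this as often as needed without rebuilding the whole index.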

Good luck!