Web Trawling or Deep web search
A friend asked me how I keep in touch with the latest and greatest. The answer was “/.”
But seriously, if I am looking for something specific, the following bookmarks are what I use for my searches, i.e. when I want more than what Google can give me(!).
- PDF search
- Articles and journals archive
- Yet to be investigated
- Bookmarks
- Research area on CS
Other links
Dictionary of Algorithms
An open directory
Hard selling or phishing
A call on my mobile phone on 14 January, 9:41 AM IST from +91-80-5119300 went
along the following lines:
Caller: Hello, Am I speaking to Satish?
Me: Yes.
Caller: Hello sir. I am calling from Hutch. We are providing an email
intimation service. Please give me your email ID.
Me: I haven’t asked for this service.
Caller: Yes sir. You have been selected for the email bill notification
service.
Me: I am not interested.
Caller: Thank you sir.
hangs up.
What’s wrong with the above call? Something was not right.
First, someone calls me up and asks for my email ID without explaining
why they want it. Second, there was no mention of a privacy policy, or of
whether they are going to use it for further marketing or sell it to spammers!
Is this a marketing gimmick or email harvesting?
What got my goat was the subtle way of ensuring that they get your email ID. If
you notice, the first step is to ask for the email ID. It’s your responsibility
to ask the right questions before divulging the answers.
How different is this from phishing?
What saddened me in the whole process?
Am I getting a cynical view of the world? Do I see schemes under everything?
Have I started to read too much in my day to day setup?
My natural reaction on the call was very paranoid and defensive. The moment I heard
Hutch, I was on A-Alert mode.
I am not sure if I like progress/capitalism anymore.
This guy sums up my feeling to perfection!
Draw your mug
Warning: It took me around 10 minutes to finish my first attempt.
A cool web-based application where you can attempt to draw your own face.
My first attempt 
Another shot at it 
You too can try it here.
How nerdy am I?
I always believed I was normal, but this guy wants to prove otherwise.
Previously seen here.
Indian Ocean Tsunami
PM Relief Fund
If you are in India, you can write a cheque/DD to:
Prime Minister’s National Relief Fund
Address: Prime Minister’s Office, South Block, New Delhi-110001.
You can make a difference
Donate here:
Oxfam
Sarvodaya, NGO in Sri Lanka
Prime Minister Of India’s Relief Fund
Aid India
Other links
SEA-EAT blog
Google page
Amateur Seismic Centre in Pune
More Links
Wikipedia
Report on damages
Enabling compression in Web Servers
A site serving static or read-only content generally does not generate large
amounts of data for the client to consume. Even for large documents or
articles, most sites split the content across multiple pages to reduce the
amount of data to be downloaded. Applications with web interfaces can be
different. Their users are generally used to client-server applications whose
UI has many smart tricks on the client side, and a typical user expects the
same behaviour from the web solution. A clever UI designer can design an
interface which works well even in the web-based mode.
But there are many cases where a small, thin interface is difficult to design.
Standard examples are dashboards, reporting pages, or cases where filters are
removed.
For these cases it is a good idea to enable stream compression at the web server
level. This can drastically reduce the bandwidth requirement and hence improve
download times for a site.
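To get a feel for the savings on a report-style page, here is a minimal Python sketch. The page is a made-up table of 2000 similar rows (a hypothetical stand-in for a dashboard, not data from a real site), and `compression_savings` is a helper name invented for this illustration.

```python
import gzip


def compression_savings(payload: bytes, level: int = 6) -> float:
    """Return the fraction of bytes saved by gzip-compressing payload."""
    if not payload:
        return 0.0
    compressed = gzip.compress(payload, compresslevel=level)
    return 1.0 - len(compressed) / len(payload)


# Hypothetical report page: one table with many near-identical rows.
rows = "".join(
    f"<tr><td>{i}</td><td>Order {i}</td><td>Shipped</td></tr>\n"
    for i in range(2000)
)
page = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

print(f"original: {len(page)} bytes, saved: {compression_savings(page):.0%}")
```

Repetitive markup like this compresses very well. Real pages will vary, so measure your own content before deciding.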
A few gotchas:
- Enabling compression will increase the server CPU load, but in a cluster
environment I definitely recommend it. As a general rule of thumb, if the
server machine's maximum load is 80% or less, enable compression; otherwise add
another machine to your cluster environment before enabling it.
- There are browser quirks where compression can create problems. Most notable
are Netscape 4.xx (these can be safely ignored), IE 5.5 (yikes, a bad browser;
if your users have it, recommend IE 5.5 SP2) and the IE 6.0 Japanese version
(similar to IE 5.5 in bugs!).
- Compressing streams smaller than 1K will create unnecessary load on the server.
If possible, disable compression for streams smaller than ~500-1000 bytes.
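The small-stream gotcha can be sketched in Python as follows. This is only an illustration of the idea, not how IIS or Apache implement it. `maybe_compress` and `MIN_COMPRESS_BYTES` are hypothetical names, and the 1000-byte default is taken from the upper end of the range mentioned above.

```python
import gzip

# Assumed threshold, from the ~500-1000 byte guideline above.
MIN_COMPRESS_BYTES = 1000


def maybe_compress(body: bytes, min_size: int = MIN_COMPRESS_BYTES):
    """Return (payload, content_encoding), compressing only when it pays off."""
    if len(body) < min_size:
        # Too small: gzip header/trailer overhead and CPU cost outweigh the gain.
        return body, None
    compressed = gzip.compress(body)
    if len(compressed) >= len(body):
        # Incompressible data (e.g. already compressed): send it unchanged.
        return body, None
    return compressed, "gzip"


small = b"ok"
large = b"<td>row</td>" * 500

print(len(gzip.compress(small)) > len(small))  # gzip makes tiny bodies bigger
print(maybe_compress(small)[1], maybe_compress(large)[1])
```

A caller would add a `Content-Encoding: gzip` header only when the second value is not `None`.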
There is a decent article covering the same here. I will not repeat the points already discussed there, but here is my take:
- Set compression for both static and dynamic files.
- Disable compression for those aspx or server script files that serve
already compressed files, e.g. jpeg, mpeg, dwf, jpg, jpe, gif, zip, cab, mpg,
mpe, mp3, png, amp, ptp, dwp, pnp, and zgl.
- Understand the following IIS variables: HcDoDynamicCompression and
HcNoCompressionForRange. In fact, read up on IIsCompressionSchemes.
- To disable compression for certain files or nodes, set DoDynamicCompression
to false.
e.g. scripts:
[code]
cscript C:/Inetpub/AdminScripts/adsutil.vbs set w3svc/{siteID}/root/DoStaticCompression True
cscript C:/Inetpub/AdminScripts/adsutil.vbs set w3svc/{siteID}/root/DoDynamicCompression True
cscript C:/Inetpub/AdminScripts/adsutil.vbs set W3SVC/{siteID}/root/{subfolder}/{page.aspx}/DoDynamicCompression False
[/code]
Otherwise you can use the Metabase Explorer to do this.
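If your application decides per response whether to compress, the list of already compressed formats above can be turned into a simple check. This is a rough Python sketch of the idea, not an IIS feature. `should_compress` and `PRECOMPRESSED` are hypothetical names, and the extensions are copied from the list above.

```python
import os

# Extensions from the list above: formats that are already compressed.
PRECOMPRESSED = {
    "jpeg", "mpeg", "dwf", "jpg", "jpe", "gif", "zip", "cab", "mpg",
    "mpe", "mp3", "png", "amp", "ptp", "dwp", "pnp", "zgl",
}


def should_compress(path: str) -> bool:
    """Return False when the file extension marks already compressed content."""
    extension = os.path.splitext(path)[1].lower().lstrip(".")
    return extension not in PRECOMPRESSED


for name in ("reports/summary.aspx", "images/photo.JPG", "archive.zip"):
    print(name, should_compress(name))
```

This only looks at the file name. A script page that streams a zip or an image under an .aspx URL would still need DoDynamicCompression switched off for that page, as shown in the commands above.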
Apache web server
There are two modules available, namely mod_deflate and mod_gzip.
As the following thread explains, if you have Apache 2, use mod_deflate; otherwise, depending upon your configuration requirements, go for either.
The Apache site documents mod_deflate (http://httpd.apache.org/docs-2.0/mod/mod_deflate.html),
and mod_gzip is explained at http://www.schroepl.net/projekte/mod_gzip/config.htm
A sample mod_gzip configuration:
[code]
mod_gzip_on Yes
#mod_gzip_send_vary Yes
mod_gzip_add_header_count Yes
mod_gzip_dechunk Yes
mod_gzip_can_negotiate Yes
mod_gzip_update_static No
mod_gzip_static_suffix .gz
mod_gzip_minimum_file_size 300
mod_gzip_maximum_inmem_size 60000
mod_gzip_maximum_file_size 2000000
mod_gzip_temp_dir /tmp
mod_gzip_keep_workfiles No
## minimal included set of items to compress to avoid sending Vary * header
## This is very conservative and cooperates superbly with mod_expires
## caching headers. (Netscape 4.0[678] will still have problems, but it
## only affects a fraction of a percent of hits (about 0.00015 == 0.015%)
## on our site with the settings below)
mod_gzip_item_include uri \.s?html?$
mod_gzip_item_include mime ^text/
mod_gzip_item_include file \.php$
mod_gzip_item_include file \.js$
mod_gzip_item_include file \.css$
mod_gzip_item_exclude mime ^image/
[/code]