Friday, December 4, 2009

Why do certain urls not have the .htm extension?

Why do certain URLs not have the .htm extension?



For example, this URL has no .htm / .html extension:



http://en.wikipedia.org/wiki/Binary_file



And this one does:



http://www.geocities.com/html_4u/index.h...



Also, how can I make it so that the web pages I make do NOT have the .htm / .html extension?



Any help would be appreciated.



Why do certain urls not have the .htm extension?



When you see a file extension (.htm, .html, .php, .asp, etc.), the URL is usually referencing an actual file on the web server. The extension tells the server how to process the file. For example, a .asp file will get run through an ASP (Active Server Pages) interpreter, which executes the VBScript code (the default scripting language for ASP) before the server sends back the resulting HTML.



The reason you don't see the file extension on some URLs is that web sites are usually configured to serve certain files by default when no file is specified, using a list called an "Order of Precedence" (Apache sets this list with its DirectoryIndex directive). When a browser makes an HTTP request to a site and no file is specified in the URL, the server scans down the list until it finds a file with one of those names in the requested folder. It then returns that file after doing any server-side processing. For example, my web host will display a file called index.asp if a visitor does not specify a file in the address. So typing http://ron-and-iris.geefamily.net gives you the same results as typing http://ron-and-iris.geefamily.net/index....



In fact, in your example above, Geocities will display the same page if you take off the index.html, because index.html is an entry in its list. If the server doesn't find index.html, or any other file named in its "Order of Precedence," then it will display an error or redirect to a default "404" page (file not found).
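The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular web server is implemented; the function name and the contents of the list are assumptions, and a real host defines its own list.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical "Order of Precedence"; ask your web host for the real one.
DEFAULT_DOCUMENTS = ("index.html", "index.htm", "default.htm", "default.html")


def pick_default_document(
    files_in_folder: Iterable[str],
    precedence: Sequence[str] = DEFAULT_DOCUMENTS,
) -> Optional[str]:
    """Return the first name from the precedence list that exists in the folder.

    None means no default file was found, which is where a server would
    typically answer with an error such as 404.
    """
    available = set(files_in_folder)
    for name in precedence:  # scan the list top to bottom
        if name in available:
            return name
    return None
```

With a folder containing about.html, index.htm and default.htm, this sketch picks index.htm, because it appears earlier in the list than default.htm.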



So, to make a web page without displaying a file extension, you first have to find out from your web host what the "Order of Precedence" is. Usually default.htm, default.html, index.htm and index.html are safe bets. If you're hosted on a Linux server, index.php is probably an additional default name that you can use. On a Windows host that supports ASP, index.asp can also be a default name. If you create a file with one of those names, it will automatically be displayed when a user types in the address without specifying the file name. Note, though, that if you want to NOT display the extension for multiple pages, you'll have to create a subfolder for each page, each with its own index.htm file (or whatever name you decide on from the "Order of Precedence").
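As a rough sketch of the subfolder approach, the Python helper below works out where a page file would have to live so that it can be reached without its extension. The function name is made up for illustration, and index.htm is assumed to be on your host's list.

```python
from pathlib import PurePosixPath


def folder_layout(page_file: str, default_name: str = "index.htm") -> str:
    """Map a page such as 'about.htm' to 'about/index.htm'.

    The page is then reachable as /about/ instead of /about.htm,
    provided default_name is one of the host's default file names.
    """
    path = PurePosixPath(page_file)
    if path.name == default_name:
        return str(path)  # already a default file, nothing to move
    # Drop the extension to get the folder name, then add the default file.
    return str(path.with_suffix("") / default_name)
```

For example, products/cars.html would become products/cars/index.htm, and visitors would type the address ending in /products/cars/.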



Why do certain urls not have the .htm extension?



Because they aren't HTML? Some pages are meant to be parsed by PHP, for example.



Just change your Apache configuration to handle pages as required.



http://httpd.apache.org/docs/2.2/
Because HTML means codes that make certain images and symbols that you can't get just by typing! It depends.
It used to be required to have a .html or .htm for a website...



But now with PHP, Flash etc... there is no need...



Apache is a web server...
If you look up the definition of URL, you see that it indicates the location of a resource. Not a file. Not a directory. Not anything but a resource. Maybe the resource is a file, but who cares. You want a webpage. You don't care whether it is HTML, came from a database, used PHP, or whatever. It has no business being part of the URL.



More importantly, non-trivial webpages are resources that aren't single files. They may be retrieved from a memory cache or generated on the fly from programming code and database lookups. Why give them a .htm extension? There's no rule saying they must have one.



It's very easy to map a URL to a filesystem directory. Example.com/mysite/mypage.html literally maps to a directory mysite, where mypage.html is stored. But in non-trivial sites, this mapping is non-existent. Example.com/complexsite/cars may be a page that is compiled on the fly from multiple database lookups, a few memory cache lookups, various design templates, program code that comes from multiple tiers, and so on. There is no directory and no single html page.
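To make that concrete, here is a minimal Python sketch of a site where a URL path maps to code instead of a file. Everything in it is hypothetical: the route table, the handler, and the hard-coded list that stands in for the database and cache lookups.

```python
from typing import Callable, Dict, Tuple


def cars_page() -> str:
    # Stand-in for database lookups, cache reads, and design templates.
    cars = ["sedan", "coupe"]
    items = "".join(f"<li>{car}</li>" for car in cars)
    return f"<h1>Cars</h1><ul>{items}</ul>"


# The URL path is just a key; no file or directory with this name exists.
ROUTES: Dict[str, Callable[[], str]] = {"/complexsite/cars": cars_page}


def respond(url_path: str) -> Tuple[int, str]:
    """Return (status code, body) for a requested path."""
    handler = ROUTES.get(url_path)
    if handler is None:
        return 404, "Not found"
    return 200, handler()  # the page is built on the fly
```

Since the path is only a lookup key, adding .htm to it would be a purely cosmetic choice.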



Are you running your own webserver? Can you configure it? If you cannot, you will not be able to do this kind of arbitrary mapping. Either you rewrite the URLs, which requires configuring the server, or you set a handler for the requested resources. For example, some of my sites have nothing in the webroot directory. They are handled by server code, which resolves each requested resource appropriately. If you have programmed a site like that, then you will have such URLs.
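The URL rewriting option can be illustrated with a small Python sketch that resolves an extensionless path to a stored file. It only mimics the idea; a real server does this through its own configuration, and the function name and the extension list here are assumptions.

```python
from typing import Optional, Sequence, Set


def rewrite(
    url_path: str,
    files: Set[str],
    extensions: Sequence[str] = (".html", ".htm"),
) -> Optional[str]:
    """Resolve a URL path such as '/about' to a stored file such as 'about.html'.

    Returns None when nothing matches, which a server would report as 404.
    """
    name = url_path.strip("/")
    if name in files:
        return name  # the URL already names a real file
    for ext in extensions:  # try each known extension in order
        if name + ext in files:
            return name + ext
    return None
```

With about.html stored on the server, a request for /about resolves to about.html, so the visitor never sees the extension.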
