About URL's, an introduction.
What does URL stand for?
It sounds technical but it's actually quite simple: URL stands for "Uniform Resource Locator". Some folks say "Unique Resource Location", but to most of us it just means a "link".
Why are URLs important to the web?
Web pages must be linked to allow readers to move from one page to another. If you enter a word into an internet search engine such as Google or MSN, the website will do its clever stuff in the background and then show you a page with a list of links, hopefully relating to the search word you entered. These links are URLs. Most websites contain at least a few pages, and links allow you to "navigate" your way from one to another.
The internet is held together by URLs
I've seen Page Not Found errors, what's this about?
Each link is like an address. It points to a very specific place on the internet, in much the same way your postal address lets the mail service know how to deliver a letter to your door. I've chosen my words carefully here (i.e. place rather than page) because a link might not actually point to a web page; it might just as easily point to a graphic file or text document. Anything you can keep on a computer hard disk can be linked from a web page!
IMPORTANT: a Web Page IS a FILE.
A "Page Not Found" error means the page (FILE) has either been moved to another place OR it's been deleted OR most likely, the URL you followed is simply wrong.
How can URLs be wrong?
URLs or links are easy to get wrong. Even professionals get confused or make simple errors that have BIG effects, for instance:
If you click on this link your browser may warn you about a POP UP window. That's okay, the link just shows a picture:
http://www.famouswelsh.com/images/feature_articles/URLHelp/redcar.jpg
If you click on this link you should get an error message because the link is wrong:
http://www.famouswelsh.com/images/feature_articles/URLHelp/REDCAR.jpg
So, what's the difference?
The first link points to an image file (redcar.jpg) saved on the famouswelsh.com computer. The second points to the same image, but notice the file name is written as REDCAR.jpg. The only difference is that the first name is written in small characters and the second in CAPITALS. On most web servers "this" is different to "THIS".
A file called "redcar.jpg" is different to a file called "REDCAR.jpg"; file names like these are what's known as case sensitive. So you can see how easy it is to make a mistake, especially when some computers (Windows) can not tell the difference between the names.
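To make the point concrete, here is a small Python sketch that compares the two links above. It is only an illustration: the function names are made up for this example, and the second function simply imitates a system that ignores upper and lower case.

```python
from urllib.parse import urlsplit

# The two links from the text above; only the case of the file name differs.
working = "http://www.famouswelsh.com/images/feature_articles/URLHelp/redcar.jpg"
broken = "http://www.famouswelsh.com/images/feature_articles/URLHelp/REDCAR.jpg"


def same_file(url_a: str, url_b: str) -> bool:
    """Compare the file paths exactly, as a case-sensitive web server would."""
    return urlsplit(url_a).path == urlsplit(url_b).path


def same_file_ignoring_case(url_a: str, url_b: str) -> bool:
    """Compare the file paths while ignoring case, as Windows would."""
    return urlsplit(url_a).path.lower() == urlsplit(url_b).path.lower()


print(same_file(working, broken))                # False
print(same_file_ignoring_case(working, broken))  # True
```

The first result is why the second link fails: to a case-sensitive server the two names are simply different files.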
What's the difference between a web page and a file?
There is no difference: a web page IS a file. Web pages are just text documents created in such a way that your web browser (Internet Explorer and Firefox are just two examples) can interpret instructions in the text and produce a page like the one you're reading right now!
On the internet the important thing to remember is what a file is called, not what it says it is. Look at the top of your browser window: the title of the page appears in the space above the "File Edit - whatever" menus. The page you're reading right now is titled "About URL's, an introduction" BUT the file name is actually something quite different, "1900-url-introduction". The proper URL for this page is:
"http://www.famouswelsh.com/component/content/article/23-this-website/1900-url-introduction"
So how does it all go together?
One of the most confusing issues we have to deal with is file naming. The following diagram might help explain. It depicts a simple website about "A Red Car" - as you can see there is a home page and four linked pages along with a picture of a car. Now, a computer hard disk is a little like a filing cabinet, let me explain...
The linking method we've described here is called absolute linking. There's another way of linking files called 'relative linking'. With relative links you don't need to specify the whole address, just the 'route' from the top folder of your website directory, so the picture of the car above could have been written like this: '/picture.jpg'. If we'd chosen to place all of our images in a folder, say 'images', we'd specify it like this: '/images/picture.jpg'. The /images/ part is a folder.
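As a rough sketch of how a browser turns one of these short links back into a full address, Python's standard urljoin function does much the same job. The address www.example.com and the page cars/index.html are made up for the illustration; only the two short links come from the text above.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
page = "http://www.example.com/cars/index.html"


def full_address(page_url: str, link: str) -> str:
    """Work out the complete address a link points to, starting from the page it sits on."""
    return urljoin(page_url, link)


# Links that begin with "/" are followed from the top folder of the website.
print(full_address(page, "/picture.jpg"))         # http://www.example.com/picture.jpg
print(full_address(page, "/images/picture.jpg"))  # http://www.example.com/images/picture.jpg

# An absolute link already carries the whole address, so it comes back unchanged.
print(full_address(page, "http://www.example.com/picture.jpg"))
```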
The "Red Car" website begins with its home page, which links to four subsequent pages and a picture of a red car. Each page and the picture are files saved to the web server.
The home page can be titled anything, whatever is relevant to the topic of the website. The same is true for all the other pages on the site, although...
IMPORTANT: The title of a page and its file name are different things! A link connects to the file name and must be exact, so watch for upper case letters when you type! One strict rule: for nearly all websites the home page file MUST be named index.htm (or .html, .xhtml, .php, etc.)!
A filing cabinet contains files! Your computer hard disk contains, er, files!
Everything is put away in order. Even if it doesn't make sense to us, your computer knows where all the information is, whether it's a text file, an email, a program/system file or an image file, like the scan of the car in the example above.
Every FILE must have a name, and two files with the same name in the same place (in the same folder) are not allowed. Just to make things worse, every file should have what is known as an extension, which is normally how a computer recognises what type of file it is. Strictly speaking, files on Unix and Mac systems don't HAVE to have an extension, but it's common practice and essential with web building.
There are lots of different file types but we can simplify matters by thinking of them as two main groupings like so:
- Program/System files
- These files make your computer work, this is what we refer to as "software".
- Documents
- These files are created with software, like the scan of my car. Each document has a "file format", a name and an "extension", which is usually assigned by the software you use to create it. In the case of this web page its name is "URLFAQ" and its extension is ".html", a type of html text file. The image of the car was saved as a file named "fto_01" and its extension is ".jpg".
We could go on for ever about file types as there are thousands of them, but you'll find some very common types that are used everywhere, such as:
- .jpg (or .jpeg) An image file type common on the internet
- .png An image file type common on the internet
- .gif An image file type common on the internet
- .tif An image file type used by printers
- .pdf A file that looks like a printed page
- .htm A web page, a type of text file
- .html (.shtml, .xml) A web page, a type of text file
- .txt (.text) A plain text file, very basic but good
- .doc A word processing document
- .mpg (.mpeg) A movie file
- .mp3 An audio file
- .wav An audio file
A web page IS A FILE. In fact it's a type of text file.
An image is A FILE, from a digital camera or a scan - whatever, it's a file!
A program like Internet Explorer is a collection of files that work together - for now we can just think of these as "system files".
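Here is a minimal Python sketch of how a computer might use the extension to decide what kind of file it has. The lookup table is just a hand-made copy of some entries from the list above, and the function name is invented for the example; real systems use much longer tables.

```python
import posixpath

# A few of the extensions listed above. This table is only an illustration.
FILE_TYPES = {
    ".jpg": "image", ".jpeg": "image", ".png": "image", ".gif": "image", ".tif": "image",
    ".pdf": "printed-page document",
    ".htm": "web page", ".html": "web page",
    ".txt": "plain text",
    ".doc": "word processing document",
    ".mpg": "movie", ".mpeg": "movie",
    ".mp3": "audio", ".wav": "audio",
}


def describe(file_name: str) -> str:
    """Split off the extension and look up what type of file it indicates."""
    name, extension = posixpath.splitext(file_name)
    return FILE_TYPES.get(extension, "unknown")


print(describe("fto_01.jpg"))   # image
print(describe("URLFAQ.html"))  # web page
print(describe("notes"))        # unknown, because there is no extension to go on
```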
What about the http:// bit, what's that?
For most users these days, your web browser will automatically fill in the http:// part of the URL. In the early days of the internet everyone would type it in as part of the URL. The http:// specifies the protocol of the connection you want to make to the file server, in this case a web server. A protocol is a little like a language: typing http:// is like telling your computer to contact the internet using the language "http".
There are other protocols, lots in fact! You might see the term "ftp" used from time to time. It's just another type of language. Browsers are quite clever and have long been able to speak a number of languages; http and ftp are just two!
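To show where the protocol sits in relation to everything else, this Python sketch pulls a URL apart into its pieces. The function names and the ftp address are made up for the example, and the second function is only a guess at the kind of "filling in" a browser does, not how any particular browser works.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Break a URL into its protocol, domain, path and file name."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "path": parts.path,
        "file": parts.path.rsplit("/", 1)[-1],  # whatever follows the last "/"
    }


def with_default_protocol(address: str) -> str:
    """Add http:// to an address typed without a protocol (an assumed, simplified rule)."""
    return address if "://" in address else "http://" + address


page = "http://www.famouswelsh.com/component/content/article/23-this-website/1900-url-introduction"
print(url_parts(page)["protocol"])  # http
print(url_parts(page)["domain"])    # www.famouswelsh.com
print(url_parts(page)["file"])      # 1900-url-introduction

# A hypothetical ftp address: the same shape, just a different "language".
print(url_parts("ftp://ftp.example.com/pub/readme.txt")["protocol"])  # ftp

print(with_default_protocol("www.google.com"))  # http://www.google.com
```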
Why do I need the www part?
Strictly speaking, from a technical point of view there is no need to have www within a URL, but you'll find that in many cases you'll get an error message if you leave it off.
A little demonstration
Instead of typing www.google.com into your web browser, try typing just ww.google.com or even leave the www part off altogether! You'll see that the Google search engine website still appears; however, many other websites would not. The www part of the URL is there to conform to internet conventions. The website owners can choose to take it away or change it altogether!
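One way to picture this is to split each address at the dots. In the Python sketch below (the function name is invented for the example), the www turns out to be just one more piece of the name, which is why the website owner is free to change or drop it.

```python
from urllib.parse import urlsplit


def host_labels(url: str) -> list:
    """Split the domain part of a URL into its dot-separated pieces."""
    return urlsplit(url).hostname.split(".")


print(host_labels("http://www.google.com/"))  # ['www', 'google', 'com']
print(host_labels("http://ww.google.com/"))   # ['ww', 'google', 'com']
print(host_labels("http://google.com/"))      # ['google', 'com']
```

These are three different names as far as the internet is concerned. Whether each one leads to a working website depends on how the owner has set things up.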
What about the .com or .net part?
This is the last part of what's known as the domain: in famouswelsh.com, for instance, it's the .com. It's beyond the scope of this document to explain how the domain system works. Suffice to say, each domain is unique: two identical domains can not exist on the internet at the same time.
A domain can be thought of as a "place" on the internet where files are stored.
If you need to see a page on a particular website you need its domain to find it, just like the postal service needs your address to deliver a letter to your door. Just to confuse matters, having gone to lengths to explain how a file saved in capital letters is different to one in lowercase letters, this is not true of domains, which ignore case. Most domains are written in lowercase but quite often they're mixed, so www.redcar.com could be found by typing www.RedCar.com. As long as the spelling is correct it will work.
Hope this helps. There's loads of information on the internet about how to save files, link pages and pretty well every other issue you could think of connected with computers. If you have any questions you can always contact us at famouswelsh.com
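The difference between the two halves of an address can be sketched in Python like this. The function name and the file path /Images/RedCar.jpg are made up for the example; the idea is simply that the domain can safely be put into lowercase while the file part must be left exactly as typed.

```python
from urllib.parse import urlsplit, urlunsplit


def tidy_domain(url: str) -> str:
    """Lowercase the protocol and domain but leave the file path untouched."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),  # the domain: case does not matter here
        parts.path,            # the file path: case DOES matter, so it is not changed
        parts.query,
        parts.fragment,
    ))


print(tidy_domain("http://www.RedCar.com/Images/RedCar.jpg"))
# http://www.redcar.com/Images/RedCar.jpg

# Two spellings of the same domain end up identical.
print(tidy_domain("http://www.RedCar.com") == tidy_domain("http://www.redcar.com"))  # True
```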