
Problem with apostrophes and hyphens (5 posts)

  • Started 14 years ago by Gary Greene
  • Latest reply 14 years ago from Gary Greene
  • Gary Greene Member

    I'm trying to update a number of pages, adding a link. When I do, even though I have not touched any of the text on the page, any word with an apostrophe, or any hyphen, becomes corrupted onscreen. This happens with 2 diff. HTML editors, so I think it is Fetch that is causing my problem.

    The only solution I have found is to reopen the page in my editor, delete and re-type the word, re-upload, and voila! - problem fixed. This is time consuming and tedious however.

    Using a Mac and Snow Leopard and either Taco HTML edit or Espresso. Any suggestions?

    Edited 14 years ago #

  • Scott McGuire Administrator

    Hi,

    Which version of Fetch are you using?

    And you are saying that after you upload the webpage and look at it in your web browser, that is when you first see the problem with apostrophes and hyphens?

    Thanks,

    Scott McGuire
    Fetch Softworks

    Posted 14 years ago #

  • Gary Greene Member

    Hi Scott!

    I'm using Fetch 5.5.3. When I edit the page, I don't see the problem in the editor. After I upload the page, then view it 'live' on the web, then I see that the apostrophes and hyphens have been corrupted. Here is a page I did yesterday that I haven't fixed yet:

    www.nationalfiber.com/genInt2.htm

    Any help is appreciated. The folks at Taco HTML software say they've never run into this problem before. Lucky me!
    :-)

    Posted 14 years ago #

  • Scott McGuire Administrator

    Hi,

    Thanks for the additional information, and the link to the uncorrected page.

    To begin with, Fetch 5 does not make changes to your files when you upload them, so that is not what is going on.

    This is a problem of character encoding (also known as "text encoding") mismatch.

    Text files, including HTML files, can be saved using a variety of different character sets. A character set is a system that specifies which numerical code represents which character, and the set of numerical codes used is called a character encoding. While most character encodings use the same codes for letters, digits, and common punctuation, they often use different numerical codes to represent special characters such as accented characters, curly quotes, and hyphens/dashes (such as you have). If your file uses one encoding, but something tries to display it using a different encoding, you end up with some characters displaying as gibberish.

    Also, web servers don't know which character encodings text files use unless you tell them. If you don't tell the web server which encoding your pages use, it will assume they use a certain default character encoding, and if that isn't the encoding you used, your pages will display incorrectly.

    It looks like one part of the problem is that your uncorrected pages are using Unicode (UTF-8) as the character encoding, but your server defaults to assuming that pages should be displayed with the "Western (ISO Latin 1)" encoding.

    You can verify that this is the case in Safari by going to the View menu, choosing the Text Encoding submenu, and choosing "Unicode (UTF-8)." You'll see that your webpage is redrawn and the characters display correctly. That's good, but not very helpful, because you don't want everyone to have to manually choose the encoding to see your pages correctly. (You haven't actually made a change to the webpage by choosing this option; you've only changed how Safari is displaying it on your computer.)

    The second part of the problem is that your webpages don't include any information that tells web browsers which character set the page is using, which is why they end up being displayed using the server's default. (With a properly configured server, a webpage is displayed with the encoding the page says it uses, if one is specified.)

    Usually the encoding is specified by a line in the <head> section of your HTML file, and it looks something like this (this is not necessarily the line you would want to use):

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    

    My guess is that your pages were originally written using the Unicode character encoding, but your editor is set to default to Western (ISO Latin 1) for further editing. When you delete and re-type the text, you change the problem characters' encoding to Western (ISO Latin 1), which is why the files display correctly after you re-upload them. The files still don't have an HTML line specifying the character set used by the page, but after you've deleted and retyped everything, the character encoding matches the server's default encoding, so everything looks okay.

    You should check your text editor to see which character encoding it's set to use when you open one of your pages. To fix the problem, one option would be to change the text editor to use Unicode (UTF-8) if it is not set to that already. If you do that, you will need to be sure to add the HTML line specifying the character encoding the file uses so that it displays properly in browsers. Another option is to have your text editor convert the whole file from one encoding to another, so that you could just convert the files to Western (ISO Latin 1) without needing to retype everything. Then when you upload the file, it would match the server's default encoding and display properly. I am not familiar with Taco HTML Editor, so I can't tell you how to do that conversion.

    Anyway, this all needs to be done in your text editor, or by adding the HTML line to specify the character encoding used by the files; as I said before, Fetch does not change either the text or character encoding of files when they are uploaded.

    I hope this all makes sense; I realize it's a bit complicated if you're not familiar with it. Please let us know if you have further questions.

    Thanks,

    Scott McGuire
    Fetch Softworks

    Edited 14 years ago #
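
The mismatch described in the post above can be reproduced in a few lines of Python. This is only a minimal sketch, not anything Fetch or the editors do. The sample sentence and the `view_as` helper are made up for illustration, and `cp1252` stands in for "Western (ISO Latin 1)" on the assumption that the browser treats that label as Windows-1252, as the WHATWG Encoding Standard specifies.

```python
def view_as(text: str, saved_as: str, displayed_as: str) -> str:
    """Encode text the way an editor saved it, then decode it the way a viewer assumes."""
    raw = text.encode(saved_as)
    # errors="replace" keeps the sketch from raising on bytes the viewer cannot map.
    return raw.decode(displayed_as, errors="replace")


# Hypothetical sample text with a curly apostrophe (U+2019) and an en dash (U+2013).
original = "It\u2019s a 2\u20133 day job"

# Saved as UTF-8 but displayed with the Western default: each special
# character turns into three unrelated characters.
garbled = view_as(original, saved_as="utf-8", displayed_as="cp1252")
print(garbled)

# Saved and displayed with the same encoding: the text survives intact.
matched = view_as(original, saved_as="utf-8", displayed_as="utf-8")
print(matched == original)

# The damage is only in the interpretation, not in the bytes, so reversing
# the wrong decode recovers the original text.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

Plain letters and digits come through unchanged in every case, which matches what Gary saw: only the apostrophes and hyphens/dashes were affected.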

  • Gary Greene Member

    Thanks! That's great information! I REALLY appreciate it.

    Regards,

    Gary

    Posted 14 years ago #
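
For the conversion option mentioned in the thread, a rough Python sketch of the idea follows. It is not how Taco HTML Edit or Espresso do it; `guess_encoding` and `reencode` are hypothetical helpers, the sample markup is invented, and `cp1252` is again assumed as the practical meaning of "Western (ISO Latin 1)". Strict ISO-8859-1 has no curly quotes or en dashes, so converting this text with Python's `latin-1` codec would raise an error.

```python
def guess_encoding(data: bytes) -> str:
    """Very rough check: pure ASCII, valid UTF-8, or some other 8-bit encoding."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown 8-bit"


def reencode(data: bytes, source: str, target: str) -> bytes:
    """Convert file contents from one encoding to another without retyping anything."""
    return data.decode(source).encode(target)


# Hypothetical page fragment as an editor would save it in UTF-8.
page = "<p>Gary\u2019s page \u2013 updated</p>".encode("utf-8")
print(guess_encoding(page))

# Convert the bytes to the Western encoding the server assumes by default.
western = reencode(page, "utf-8", "cp1252")
print(guess_encoding(western))

# Same text, different bytes.
print(western.decode("cp1252") == page.decode("utf-8"))
```

The alternative fix from the thread, keeping the file in UTF-8 and declaring it with the `<meta>` line, avoids the conversion step entirely.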
