Question
In my blog app, a user can enter any text as a title for their entry and then I generate a URL based on the text.
I validate their title to make sure it only contains letters and numbers.
If they enter something like
Lorem 3 ipsum dolor sit amet
how could I generate the more SEO friendly version of this text:
Lorem-3-ipsum-dolor-sit-amet
Answer 1:
In practice, it's really not as simple as replacing spaces with hyphens. You would often also want to make it all lowercase and normalize or replace diacritics such as á, ö, and è, which are invalid URL characters. The only characters that can be used as-is are the "Unreserved characters" (letters, digits, hyphen, period, underscore, and tilde) listed in the 2nd table of this Wikipedia page.
Here's what such a function can look like:
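import java.text.Normalizer;
import java.text.Normalizer.Form;

public static String prettyURL(String string) {
    return Normalizer.normalize(string.toLowerCase(), Form.NFD)
        .replaceAll("\\p{InCombiningDiacriticalMarks}+", "") // drop the accents that NFD split off
        .replaceAll("[^\\p{Alnum}]+", "-"); // turn each run of other characters into one hyphen
}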
It does basically the following:
- lowercase the string
- remove combining diacritical marks (after the Normalizer has "extracted" them from the actual chars)
- replace non-alphanumeric characters by hyphens
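As a rough, self-contained sketch of the steps above: the class name `SlugDemo`, the method name `toSlug`, the use of `Locale.ROOT`, and the final trimming of leading and trailing hyphens are assumptions added for illustration and are not part of the original function. The trimming is there because a title that starts or ends with punctuation or spaces would otherwise produce a slug that starts or ends with a hyphen.

```java
import java.text.Normalizer;
import java.util.Locale;

final class SlugDemo {

    // Hypothetical helper: same three steps as prettyURL, plus hyphen trimming.
    static String toSlug(String title) {
        String slug = Normalizer
                // Locale.ROOT (an assumption) keeps lowercasing independent of the JVM's default locale.
                .normalize(title.toLowerCase(Locale.ROOT), Normalizer.Form.NFD)
                // Remove the combining marks that NFD separated from their base letters.
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
                // Replace every run of non-alphanumeric characters with a single hyphen.
                .replaceAll("[^\\p{Alnum}]+", "-");
        // Extra step (not in the original): strip hyphens left at either end.
        return slug.replaceAll("^-+|-+$", "");
    }

    public static void main(String[] args) {
        System.out.println(toSlug("Lorem 3 ipsum dolor sit amet"));   // lorem-3-ipsum-dolor-sit-amet
        System.out.println(toSlug("  Crème brûlée, à la carte!  "));  // creme-brulee-a-la-carte
    }
}
```

Note that a title made up only of punctuation yields an empty string here, so a real application would probably need a fallback such as the entry's ID.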
See also:
- JSP 2.0 SEO friendly links encoding
Answer 2:
String s = "Lorem 3 ipsum dolor sit amet";
s = s.replaceAll(" ","-");
Answer 3:
Since it doesn't seem to let me comment, I would do:
String s = "Lorem 3 ipsum dolor sit amet";
s = s.replaceAll(" ","_");
I'm using the underscore character instead because it is commonly read as a space indicator. It's been a while since I've done Java, but I know there is a function in .NET that cleans up a file name so it's safe for the file system. A lot of the same general rules apply to a URL, so if you can find a similar function in the Java API, it would be worth taking a look.
Source: https://stackoverflow.com/questions/3658991/how-to-translate-lorem-3-ipsum-dolor-sit-amet-into-seo-friendly-lorem-3-ipsum