View Issue Details

ID: 0005423
Project: Composr
Category: downloads
View Status: public
Last Update: 2024-08-13 00:44
Reporter: dragomangkhn
Assigned To: Patrick Schmalstig
Severity: Minor-bug
Status: closed
Resolution: unable to reproduce
Product Version: 10.0.43
Fixed in Version:
Summary: 0005423: automatic tag word-breaks on tag lines for downloads
Description: The tags created automatically according to the descriptions I enter for a download don't show letters/characters which aren't in English. Furthermore, it creates another line for the latter part of the word such as "s" "nav" for "sınav".
Steps To Reproduce: Add a download with descriptions in a language other than English (Turkish for my case) and after saving the download, you will see the tags only with English characters.
Tags: No tags attached.
Time estimation (hours):
Sponsorship: open

Activities

dragomangkhn

2023-11-06 12:24

reporter   ~0008015

I would like to share an update on my side: when I enter the tags manually, there is no problem showing the characters.

Patrick Schmalstig

2023-11-06 16:36

administrator   ~0008018

Good to know. From that, I can deduce that this is probably a bug within the automatic SEO keyword extraction methods. Thank you.

Patrick Schmalstig

2023-11-11 18:53

administrator   ~0008020

Hello dragoman.

SEO keyword extraction is based on the language of your Composr site. If you are trying to use Turkish characters in English content, then the keyword extraction will remove them because they are not among the word characters defined for English.

In the test case you provided, were you running the site / download under English?

Guest

2023-11-12 15:19

viewer   ~0008026

I changed the site language, yet it still couldn't get the characters without problems, so now, although the site language is English, I write the tags manually, and it is fixed.

dragomangkhn

2023-11-12 16:44

reporter   ~0008027

I am so sorry. I have just realised that I wrote the previous message on my mobile without logging in. The current situation is:
"I changed the site language, yet it still couldn't get the characters without problems, so now that the site language is English, I write the tags manually, and it is fixed."

Patrick Schmalstig

2023-11-12 19:42

administrator   ~0008028

No worries.

Can you check something within your Composr installation for me? Look under text (and text_custom), in the folder named with the codename for the language you were trying to use. First, confirm with me whether there is a word_characters.txt file in that directory. If so, does it contain a list of all the characters for that language that are considered "word characters" (one per line)?
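For anyone following along, the check described above can be sketched in a few lines of Python. This is only an illustration and not part of Composr: the function names are made up, the "TR" codename in the usage note is an assumption, and the file is assumed to be UTF-8 with one character per line.

```python
from pathlib import Path


def find_word_characters(install_root, lang):
    """Return the word_characters.txt files that exist for `lang`.

    Looks under text/<lang>/ and text_custom/<lang>/ inside the
    installation root, as described in the note above.
    (Hypothetical helper, not a Composr function.)
    """
    root = Path(install_root)
    candidates = [
        root / folder / lang / "word_characters.txt"
        for folder in ("text", "text_custom")
    ]
    return [path for path in candidates if path.is_file()]


def load_word_characters(path):
    """Read the file as a set of characters, one per line.

    Assumes UTF-8 encoding; blank lines are skipped.
    """
    characters = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            entry = line.strip("\r\n")
            if entry:
                characters.add(entry)
    return characters
```

Calling `find_word_characters("/path/to/site", "TR")` returns an empty list when no Turkish file exists in either folder, which would match the situation this issue describes.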

dragomangkhn

2023-11-15 23:36

reporter   ~0008047

There is only an EN folder, and its word_characters.txt file contains only English letters (one per line), not the characters for the language I tried to use in the tags.

Patrick Schmalstig

2023-11-18 02:03

administrator   ~0008051

That might be the problem.

You may need to install (or create) the appropriate language pack for Composr for your needs (I'm assuming Turkish). See https://compo.sr/docs10/tut-intl.htm .

There is a Turkish language pack available from Transifex.

When using a language pack, make sure text or text_custom has a folder for your language (create one if it does not exist). Then, make sure it has a word_characters.txt file, which should contain a list of characters (one per line) that are used to form words in that language. Essentially, any time the automatic keyword extraction detects a character not in this list, it treats that character as a word break. Ideally, you should also have a too_common_words.txt file to define words that should never be used as SEO keywords (for example, in English, you wouldn't ever use "the" as a keyword). And finally, you may wish to have a synonyms.txt file to define synonyms (one group of synonyms per line, with each word separated by a tab; see the one in EN for an example).
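To illustrate the word-break behaviour described above, here is a minimal Python sketch. It is not Composr's actual implementation: `split_keywords` and the two character sets are hypothetical, and the sets only approximate what a word_characters.txt file for each language might contain.

```python
def split_keywords(text, word_characters, too_common=frozenset()):
    """Split text into candidate keywords.

    Any character not in `word_characters` is treated as a word break.
    Words listed in `too_common` are dropped (exact, case-sensitive match).
    (Hypothetical helper written for this illustration.)
    """
    words = []
    current = []
    for ch in text:
        if ch in word_characters:
            current.append(ch)
        elif current:
            # Unknown character: close the word collected so far.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return [word for word in words if word not in too_common]


# Assumed contents of an English-only word character list.
ENGLISH = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
# Assumed Turkish list: the English letters plus the Turkish-specific ones.
TURKISH = ENGLISH | set("çğıöşüÇĞİÖŞÜ")

print(split_keywords("sınav", ENGLISH))  # ['s', 'nav']
print(split_keywords("sınav", TURKISH))  # ['sınav']
```

With the English-only list, the "ı" in "sınav" acts as a word break, which reproduces the "s" / "nav" split from the original report. Once the Turkish characters are in the list, the word stays whole.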

Please let me know if you need any further assistance.

Issue History

Date Modified Username Field Change
2023-10-23 12:06 dragomangkhn New Issue
2023-11-06 12:24 dragomangkhn Note Added: 0008015
2023-11-06 16:36 Patrick Schmalstig Note Added: 0008018
2023-11-11 18:53 Patrick Schmalstig Note Added: 0008020
2023-11-12 15:19 Guest Note Added: 0008026
2023-11-12 16:44 dragomangkhn Note Added: 0008027
2023-11-12 19:42 Patrick Schmalstig Note Added: 0008028
2023-11-15 23:36 dragomangkhn Note Added: 0008047
2023-11-18 02:03 Patrick Schmalstig Note Added: 0008051
2024-08-13 00:44 Patrick Schmalstig Assigned To => Patrick Schmalstig
2024-08-13 00:44 Patrick Schmalstig Status non-assigned => closed
2024-08-13 00:44 Patrick Schmalstig Resolution open => unable to reproduce