• Jon © (08.07.10 20:59) [0]
    I'm not sure if I am using this correctly or if it is a bug. This is what I am doing:


    program Test;

    uses KOL;

    var StrList: PKOLStrList;

    begin
     StrList := NewKOLStrList;
     StrList.LoadFromFile('Test.txt');
     MsgOK(StrList.Text);
     StrList.Free;
    end.



    I have an ASCII text file named Test.txt with these contents:


    1 This is line one
    2 This is line two
    3 This is line three
    4 This is line four
    5 This is line five



    Without UNICODE_CTRLS it works fine.
    With UNICODE_CTRLS I get garbage.

    My understanding is that PKOLStrList can be used in both scenarios. Am I wrong?
  • Vladimir Kladov © (08.07.10 21:49) [1]
    It works in both cases only if the text in the file is encoded as Unicode when UNICODE_CTRLS is defined and as ANSI when it is not. If the text is always encoded the same way, use the corresponding StrList or WStrList. To check whether the text is Unicode, open it as a file, read the first two bytes, and check whether they contain $FEFF. This check is not applicable if the file was saved with a non-Windows text editor (that is, not with Notepad). In that case, try opening it as a long string and look for $00 bytes in it. At the very least, the CR/LF characters in UTF-16 are encoded as $0D, $00 and $0A, $00.
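    The check described in the post above could be sketched roughly as follows. This is only an illustration, not KOL code: it uses the standard SysUtils and Classes units, and the names TByteBuf, HasUtf16LEBom, ContainsZeroByte, LooksLikeUtf16 and ReadFileBytes are hypothetical. It assumes little-endian UTF-16, where the marker $FEFF is stored on disk as the bytes $FF, $FE.

```pascal
program DetectUtf16;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  TByteBuf = array of Byte; // hypothetical helper type, not part of KOL

// True if the buffer starts with the UTF-16 little-endian byte order mark.
// Read as a little-endian Word these two bytes give $FEFF.
function HasUtf16LEBom(const Buf: TByteBuf): Boolean;
begin
  Result := (Length(Buf) >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE);
end;

// Fallback heuristic: plain ANSI text normally contains no $00 bytes,
// while UTF-16 text with Latin characters contains many of them.
function ContainsZeroByte(const Buf: TByteBuf): Boolean;
var
  I: Integer;
begin
  Result := False;
  for I := 0 to High(Buf) do
    if Buf[I] = 0 then
    begin
      Result := True;
      Exit;
    end;
end;

// Best guess only: a BOM or any zero byte is taken as a sign of UTF-16.
function LooksLikeUtf16(const Buf: TByteBuf): Boolean;
begin
  Result := HasUtf16LEBom(Buf) or ContainsZeroByte(Buf);
end;

// Reads the whole file into memory as raw bytes.
function ReadFileBytes(const FileName: string): TByteBuf;
var
  FS: TFileStream;
begin
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Result, FS.Size);
    if FS.Size > 0 then
      FS.ReadBuffer(Result[0], FS.Size);
  finally
    FS.Free;
  end;
end;

var
  Buf: TByteBuf;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: DetectUtf16 <file>');
    Halt(1);
  end;
  Buf := ReadFileBytes(ParamStr(1));
  if LooksLikeUtf16(Buf) then
    WriteLn('looks like UTF-16: consider WStrList')
  else
    WriteLn('no BOM and no zero bytes: consider StrList');
end.
```

    The result is a guess, not a guarantee: a binary file or an ANSI file with stray $00 bytes would also be reported as UTF-16, so further checks may be needed depending on the input data.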
  • Jon © (08.07.10 23:26) [2]
    Thank you for the explanation, it makes sense. I was under the impression that KOL checked for that itself internally; I see now that I was mistaken. Is there a routine built into KOL that would determine (or give a best guess) whether a string is Unicode or non-ANSI?
  • Vladimir Kladov © (09.07.10 04:48) [3]
    This depends on whoever created the file, so it makes no sense to do it automatically. You can compare this with the VCL or with other languages. A file may be encoded in many different ways, and even a Unix-style text file is not detected automatically: you either assume Windows/DOS format or handle the format yourself.

    I did at least suggest a check procedure in my previous post. It can be extended with other checks if you want, but everything depends on your task and the possible input data.
  • Jon © (09.07.10 05:14) [4]
    Thank you. I shall do as you advised and will check the input first as per your recommendations.

    I do think KOL would benefit from a built-in routine anyway - there is IsNAN so why not add IsUnicode, IsWide, Is...? It's just a suggestion to improve the library features.
  • Vladimir Kladov © (09.07.10 10:58) [5]
    Not a problem, if a universal method exists that can distinguish Unicode text from non-Unicode text in most cases. Do you know of such a method? I don't. (And I think there is no such method, otherwise my favorite browser would not have menu options for switching the decoding of a web page.)
  • Jon © (09.07.10 22:11) [6]
    OK, I see your point. I shall investigate available methods and I may suggest suitable routines. Probably an "IsASCII" that checks for non-ASCII characters may be best.
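    A minimal sketch of what such a routine might look like, assuming it simply scans for bytes above $7F. IsASCII here is the hypothetical name from the post, not an existing KOL function, and the example uses only built-in Pascal features.

```pascal
program IsAsciiDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper, not part of KOL: True if every byte of S is in
// the 7-bit ASCII range ($00..$7F). An empty string counts as ASCII.
function IsASCII(const S: AnsiString): Boolean;
var
  I: Integer;
begin
  Result := True;
  for I := 1 to Length(S) do
    if Ord(S[I]) > $7F then
    begin
      Result := False;
      Exit;
    end;
end;

begin
  // Only 7-bit characters.
  WriteLn(IsASCII('1 This is line one'));
  // Ends with byte $E9, which is outside the ASCII range.
  WriteLn(IsASCII(AnsiString('caf') + AnsiChar($E9)));
end.
```

    A check like this only says that the text is plain ASCII, in which case the ANSI code page does not matter. It does not say which encoding a non-ASCII file actually uses.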