```csharp
using (var sr = new System.IO.StreamReader(path: path, …
```

If a BOM is available, the StreamReader will use it to get the correct encoding. In any other case, the Default encoding (here Windows-1252) will be used. This works fine for the first two cases, but it fails when UTF8 without BOM is used.

The problem is: how to differentiate between ANSI and UTF8 without BOM?

A first idea:

- Read the file with UTF8 encoding and catch the DecoderFallbackException.
- If such an exception is thrown, read the file again using ANSI encoding, which is likely to be the right choice.

However, this means the file or stream has to be read again, and doing things twice is something that should be avoided!

Finally, I came up with another solution. The original StreamReader does a pretty good job at choosing the right encoding based on a BOM, so we only need to deal with the case where the BOM is missing and the StreamReader uses the given encoding. Hence we need a new encoding for the StreamReader, and this is the EncodingProvider.

This class is derived from `System.Text.Encoding`, and because it is only used to get characters from the byte stream, we only need to implement the methods GetCharCount, GetChars and GetMaxCharCount. Inside these methods the DecoderFallbackException is handled, and here the actual encoding switches from UTF8 to Default. So when the StreamReader uses the EncodingProvider to read from a stream, it starts with UTF8 and, as soon as the exception occurs, switches to Default (Windows-1252). The stream remains unaffected: the position inside the stream does not need to be changed, and therefore this works for a forward-only stream as well.

Because the EncodingProvider serves a very special purpose and some of its methods are not even implemented, it should not be a public class. Instead, I chose to create the FlexiStreamReader and make the EncodingProvider a private class:

```csharp
/// <summary>
/// StreamReader that is to some extent capable of detecting the encoding of a stream.
/// </summary>
public class FlexiStreamReader : StreamReader
{
    /// <summary>
    /// Initializes a new instance of the FlexiStreamReader class for the specified stream.
    /// It is able to distinguish between UTF8 and Default encoding.
    /// </summary>
    public FlexiStreamReader(Stream stream) :
        base(stream, new EncodingProvider(), detectEncodingFromByteOrderMarks: true)
    {
    }

    /// <summary>
    /// Initializes a new instance of the FlexiStreamReader class for the specified file name.
    /// It is able to distinguish between UTF8 and Default encoding.
    /// </summary>
    public FlexiStreamReader(string path) :
        base(path, new EncodingProvider(), detectEncodingFromByteOrderMarks: true)
    {
    }

    public override Encoding CurrentEncoding
    {
        get
        {
            var enc = base.CurrentEncoding as EncodingProvider;
            // … (the rest of the getter is missing from the source)
        }
    }

    /// <summary>
    /// Starts with UTF-8 and switches to Default
    /// in case of a DecoderFallbackException.
    /// </summary>
    // … (the private EncodingProvider class is missing from the source)
}
```
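For comparison, the read-twice approach mentioned in this post (try UTF8, then read again as ANSI when a DecoderFallbackException occurs) could look roughly like the sketch below. `TwoPassReader` and its `ansi` parameter are hypothetical names chosen for illustration. The sketch assumes a `UTF8Encoding` created with `throwOnInvalidBytes: true`, since the default UTF8 encoding replaces invalid bytes instead of throwing.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the original post.
public static class TwoPassReader
{
    public static string ReadAllText(string path, Encoding ansi)
    {
        // Strict UTF-8: invalid byte sequences raise DecoderFallbackException
        // instead of being replaced silently.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            // First pass: BOM detection stays on, strict UTF-8 otherwise.
            using (var sr = new StreamReader(path, strictUtf8, detectEncodingFromByteOrderMarks: true))
            {
                return sr.ReadToEnd();
            }
        }
        catch (DecoderFallbackException)
        {
            // Second pass: the file is opened and read again with the ANSI encoding.
            using (var sr = new StreamReader(path, ansi, detectEncodingFromByteOrderMarks: true))
            {
                return sr.ReadToEnd();
            }
        }
    }
}
```

The second pass reopens the file, which is exactly the repeated work criticised above, and it is not an option when only a forward-only stream is available.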
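The switching idea behind the EncodingProvider can be sketched as follows. This is not the post's original class, only a minimal illustration under stated assumptions: the name `Utf8ThenFallbackEncoding`, the `Current` property and the constructor parameter are hypothetical. The fallback encoding is passed in rather than taken from `Encoding.Default`, because `Encoding.Default` is the ANSI code page only on .NET Framework; on .NET Core and later it is UTF8.

```csharp
using System;
using System.Text;

// Hypothetical sketch; the post's own (private) class is called EncodingProvider.
public sealed class Utf8ThenFallbackEncoding : Encoding
{
    private readonly Encoding _fallback;

    // Start with strict UTF-8 so that invalid bytes throw DecoderFallbackException.
    private Encoding _current = new UTF8Encoding(false, throwOnInvalidBytes: true);

    public Utf8ThenFallbackEncoding(Encoding fallback)
    {
        if (fallback == null) throw new ArgumentNullException("fallback");
        _fallback = fallback;
    }

    // The encoding currently used for decoding.
    public Encoding Current { get { return _current; } }

    public override int GetCharCount(byte[] bytes, int index, int count)
    {
        try { return _current.GetCharCount(bytes, index, count); }
        catch (DecoderFallbackException)
        {
            _current = _fallback; // switch once, then stay on the fallback
            return _current.GetCharCount(bytes, index, count);
        }
    }

    public override int GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex)
    {
        try { return _current.GetChars(bytes, byteIndex, byteCount, chars, charIndex); }
        catch (DecoderFallbackException)
        {
            _current = _fallback;
            return _current.GetChars(bytes, byteIndex, byteCount, chars, charIndex);
        }
    }

    public override int GetMaxCharCount(int byteCount)
    {
        // Large enough for either encoding.
        return Math.Max(_current.GetMaxCharCount(byteCount), _fallback.GetMaxCharCount(byteCount));
    }

    // Encoding to bytes is outside the purpose of this sketch.
    public override int GetByteCount(char[] chars, int index, int count) { throw new NotSupportedException(); }
    public override int GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex) { throw new NotSupportedException(); }
    public override int GetMaxByteCount(int charCount) { throw new NotSupportedException(); }
}
```

A possible use is `new StreamReader(stream, new Utf8ThenFallbackEncoding(Encoding.GetEncoding("iso-8859-1")), detectEncodingFromByteOrderMarks: true)`. One caveat worth checking: the decoder returned by the base `Encoding.GetDecoder()` forwards to these methods without keeping state between calls, so a multi-byte UTF8 sequence that happens to be split across two read buffers would probably look invalid and trigger the switch. That would fit the description of the reader as only "to some extent" capable of detecting the encoding.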