FIX_LATIN(1) User Contributed Perl Documentation FIX_LATIN(1)

fix_latin - filters a data stream that is predominantly utf8 and 'fixes' any latin (i.e. non-ASCII 8-bit) characters

  fix_latin options <input_file >output_file

  Options:

   --use-xs <value> 'auto' | 'always' | 'never'
   --version        list version number
   --help           detailed help message

The script acts as a filter, taking source data which may contain a mix of ASCII, UTF8, ISO8859-1 and CP1252 characters, and producing output that will be all ASCII/UTF8.

Multi-byte UTF8 characters will be passed through unchanged (although over-long UTF8 byte sequences will be converted to the shortest normal form). Single byte characters will be converted as follows:

  0x00 - 0x7F   ASCII - passed through unchanged
  0x80 - 0x9F   Converted to UTF8 using CP1252 mappings
  0xA0 - 0xFF   Converted to UTF8 using Latin-1 mappings
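
The byte-range table above can be illustrated with a short Python sketch. This is not the Encoding::FixLatin implementation; the function names are hypothetical, over-long UTF8 sequences are not normalised here, and the fallback for the five bytes that CP1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) is an assumption made only so the sketch never fails.

```python
def _utf8_length(lead: int) -> int:
    """Expected length of a UTF-8 sequence for this lead byte, or 0 if it is not one."""
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0


def _convert_single(byte: int) -> bytes:
    """Convert one lone 8-bit byte (0x80 - 0xFF) to UTF-8."""
    raw = bytes([byte])
    if byte <= 0x9F:
        try:
            # 0x80 - 0x9F: CP1252 mappings (e.g. 0x80 is the euro sign)
            return raw.decode("cp1252").encode("utf-8")
        except UnicodeDecodeError:
            # ASSUMPTION of this sketch only: bytes undefined in CP1252
            # fall through to the Latin-1 mapping below.
            pass
    # 0xA0 - 0xFF: Latin-1 mappings (byte value equals the code point)
    return raw.decode("latin-1").encode("utf-8")


def fix_latin_sketch(data: bytes) -> bytes:
    """Pass ASCII and well-formed UTF-8 through; convert lone 8-bit bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            # 0x00 - 0x7F: ASCII, passed through unchanged
            out.append(byte)
            i += 1
            continue
        length = _utf8_length(byte)
        chunk = data[i:i + length]
        if length and len(chunk) == length:
            try:
                chunk.decode("utf-8")  # strict check for a well-formed sequence
                out += chunk           # multi-byte UTF-8: passed through unchanged
                i += length
                continue
            except UnicodeDecodeError:
                pass
        out += _convert_single(byte)
        i += 1
    return bytes(out)
```

For example, `fix_latin_sketch(b"caf\xe9")` returns the UTF8 bytes for "café", while input that is already valid UTF8 comes back unchanged. The sketch also shows where ambiguity can arise: a run of Latin-1 bytes that happens to form a valid UTF8 sequence is passed through as UTF8.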

--use-xs 'auto' | 'always' | 'never'
    Override the default ('auto') behaviour, which tries to use the XS module and falls back to the pure-Perl version if XS is not available. Set to 'never' to always use the pure-Perl version, or 'always' to always use XS and die if it is not available.
--version (alias -v)
    Display the version number of the underlying Encoding::FixLatin and XS modules.
--help (alias -?)
    Display this documentation.

This script was originally written to assist in converting a Postgres database from SQL-ASCII encoding to UNICODE UTF8 encoding. The following examples illustrate its use in that context.

If you have a SQL format dump file that you would normally restore by piping into 'psql', you can simply filter the dump file through this script:

  fix_latin < dump_file | psql -d database

If you have a compressed dump file that you would normally restore using 'pg_restore', you can omit the '-d' option on pg_restore and pipe the resulting SQL through this script and into psql:

  pg_restore -O dump_file | fix_latin | psql -d database

To take a look at non-ASCII lines in the dump file:

  perl -ne '/^COPY (\S+)/ and $t = $1; print "$t:$_" if /[^\x00-\x7F]/' dump_file
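
For readers less familiar with Perl one-liners, the same scan can be sketched in Python. The function name `non_ascii_lines` is hypothetical, and the sketch assumes the dump is read in binary mode so that no decoding is attempted before the check.

```python
import re

# A line beginning "COPY <table> ..." starts the data for that table.
COPY_RE = re.compile(rb"^COPY (\S+)")
# Any byte outside the 7-bit ASCII range.
NON_ASCII_RE = re.compile(rb"[^\x00-\x7F]")


def non_ascii_lines(lines):
    """Yield b'<table>:<line>' for each line containing a non-ASCII byte."""
    table = b""  # empty until the first COPY statement is seen
    for line in lines:
        match = COPY_RE.match(line)
        if match:
            table = match.group(1)
        if NON_ASCII_RE.search(line):
            yield table + b":" + line
```

Passing a file object opened with `open("dump_file", "rb")` and writing each result to `sys.stdout.buffer` should give output comparable to the Perl command.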

This script is implemented using the Encoding::FixLatin Perl module. For more details see the module documentation with the command:

  perldoc Encoding::FixLatin

In particular you should read the 'LIMITATIONS' section to understand the circumstances under which data corruption might occur.

Copyright 2009-2014 Grant McLean <grantm@cpan.org>

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

2014-05-22 perl v5.32.1
