PT-SLAVE-RESTART(1)    User Contributed Perl Documentation    PT-SLAVE-RESTART(1)
NAME
pt-slave-restart - Watch and restart MySQL replication after errors.
SYNOPSIS
Usage: pt-slave-restart [OPTIONS] [DSN]
pt-slave-restart watches one or more MySQL replication slaves for
errors, and tries to restart replication if it stops.
RISKS
Percona Toolkit is mature, proven in the real world, and well tested, but all
database tools can pose a risk to the system and the database server. Before
using this tool, please:
- Read the tool's documentation
- Review the tool's known "BUGS"
- Test the tool on a non-production server
- Back up your production server and verify the backups
DESCRIPTION
pt-slave-restart watches one or more MySQL replication slaves and tries to skip
statements that cause errors. It polls slaves intelligently with an
exponentially varying sleep time. You can specify errors to skip and run the
slaves until a certain binlog position.
Although this tool can help a slave advance past errors, you
should not rely on it to "fix" replication. If slave errors occur
frequently or unexpectedly, you should identify and fix the root cause.
OUTPUT
pt-slave-restart prints a line every time it sees the slave has an error. By
default this line is: a timestamp, connection information, relay_log_file,
relay_log_pos, and last_errno. You can add more information using the
"--verbose" option. You can suppress all output using the
"--quiet" option.
SLEEP
pt-slave-restart sleeps intelligently between polling the slave. The sleep
time adapts as follows:
- The initial sleep time is given by "--sleep".
- If it checks and finds an error, it halves the previous sleep time.
- If it finds no error, it doubles the previous sleep time.
- The sleep time is bounded below by "--min-sleep" and above by
"--max-sleep".
- Immediately after finding an error, pt-slave-restart assumes another error
is very likely to happen next, so it sleeps the current sleep time or the
initial sleep time, whichever is less.
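The rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the tool's actual Perl code: the function name next_sleep and its parameter names are hypothetical, the defaults mirror the documented defaults of "--sleep", "--min-sleep", and "--max-sleep", and the order in which the halving and the "whichever is less" rule are applied is an assumption.

```python
def next_sleep(current, found_error, initial=1.0,
               min_sleep=0.015625, max_sleep=64.0):
    """Return the next sleep time in seconds (illustrative sketch only)."""
    if found_error:
        # Halve the previous sleep time, and never wait longer than the
        # initial sleep time right after an error (assumed ordering).
        candidate = min(current / 2, initial)
    else:
        # No error: back off by doubling the previous sleep time.
        candidate = current * 2
    # Keep the result within the --min-sleep / --max-sleep bounds.
    return max(min_sleep, min(candidate, max_sleep))


if __name__ == "__main__":
    sleep_time = 1.0
    # Three quiet polls followed by two polls that find an error.
    for found_error in (False, False, False, True, True):
        sleep_time = next_sleep(sleep_time, found_error)
        print(found_error, sleep_time)
```

With the defaults, a long quiet period drives the sleep time up toward 64 seconds, while a burst of errors quickly brings it back down to a second or less.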
GLOBAL TRANSACTION IDS
As of Percona Toolkit 2.2.8, pt-slave-restart supports Global Transaction IDs
introduced in MySQL 5.6.5. It's important to keep in mind that:
- pt-slave-restart will not skip transactions when multiple replication
threads are being used (slave_parallel_workers > 0), because it cannot
determine the GTID event of the failed transaction for a specific slave
thread.
- The default behavior is to skip the next transaction from the slave's
master. Writes can originate on different servers, each with their own
UUID.
See "--master-uuid".
EXIT STATUS
An exit status of 0 (sometimes also called a return value or return code)
indicates success. Any other value represents the exit status of the Perl
process itself, or of the last forked process that exited if there were
multiple servers to monitor.
COMPATIBILITY
pt-slave-restart should work on many versions of MySQL. Lettercase of many
output columns from SHOW SLAVE STATUS has changed over time, so it treats them
all as lowercase.
OPTIONS
This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.
- --always
- Start slaves even when there is no error. With this option enabled,
pt-slave-restart will not let you stop the slave manually if you want
to!
- --ask-pass
- Prompt for a password when connecting to MySQL.
- --charset
- short form: -A; type: string
Default character set. If the value is utf8, sets Perl's
binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to
DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other
value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES
after connecting to MySQL.
- --[no]check-relay-log
- default: yes
Check the last relay log file and position before checking for
slave errors.
By default pt-slave-restart will not do anything (it will
just sleep) if neither the relay log file nor the relay log position
has changed since the last check. This prevents infinite loops (i.e.
restarting the same error in the same relay log file at the same relay
log position).
For certain slave errors, however, this check needs to be disabled by
specifying "--no-check-relay-log". Do not do this unless you know what
you are doing!
- --config
- type: Array
Read this comma-separated list of config files; if specified,
this must be the first option on the command line.
- --daemonize
- Fork to the background and detach from the shell. POSIX operating systems
only.
- --database
- short form: -D; type: string
Database to use.
- --defaults-file
- short form: -F; type: string
Only read mysql options from the given file. You must give an
absolute pathname.
- --error-length
- type: int
Max length of error message to print. When
"--verbose" is set high enough to print the error, this option
will truncate the error text to the specified length. This can be useful
to prevent wrapping on the terminal.
- --error-numbers
- type: hash
Only restart this comma-separated list of errors. Makes
pt-slave-restart only try to restart if the error number is in this
comma-separated list of errors. If it sees an error not in the list, it
will exit.
The error number is in the "last_errno" column of "SHOW SLAVE STATUS".
- --error-text
- type: string
Only restart errors that match this pattern. A Perl regular
expression against which the error text, if any, is matched. If the
error text exists and matches, pt-slave-restart will try to restart the
slave. If it exists but doesn't match, pt-slave-restart will exit.
The error text is in the "last_error" column of "SHOW SLAVE STATUS".
- --help
- Show help and exit.
- --host
- short form: -h; type: string
Connect to host.
- --log
- type: string
Print all output to this file when daemonized.
- --max-sleep
- type: float; default: 64
Maximum sleep seconds.
The maximum time pt-slave-restart will sleep before polling
the slave again. This is also the time that pt-slave-restart will wait
for all other running instances to quit if both "--stop" and
"--monitor" are specified.
See "SLEEP".
- --min-sleep
- type: float; default: 0.015625
The minimum time pt-slave-restart will sleep before polling
the slave again. See "SLEEP".
- --monitor
- Whether to monitor the slave (default). Unless you specify --monitor
explicitly, "--stop" will disable it.
- --password
- short form: -p; type: string
Password to use when connecting. If the password contains commas,
they must be escaped with a backslash: "exam\,ple"
- --pid
- type: string
Create the given PID file. The tool won't start if the PID
file already exists and the PID it contains is different than the
current PID. However, if the PID file exists and the PID it contains is
no longer running, the tool will overwrite the PID file with the current
PID. The PID file is removed automatically when the tool exits.
- --port
- short form: -P; type: int
Port number to use for connection.
- --quiet
- short form: -q
Suppresses normal output (disables "--verbose").
- --recurse
- type: int; default: 0
Watch slaves of the specified server, up to the specified
number of servers deep in the hierarchy. The default depth of 0 means
"just watch the slave specified."
pt-slave-restart examines "SHOW PROCESSLIST" and tries to determine
which connections are from slaves, then connect to them. See
"--recursion-method".
Recursion works by finding all slaves when the program starts, then
watching them. If there is more than one slave, "pt-slave-restart" uses
"fork()" to monitor them.
This also works if you have configured your slaves to show up in
"SHOW SLAVE HOSTS". The minimal configuration for this is the
"report_host" parameter, but there are other "report" parameters as
well for the port, username, and password.
- --recursion-method
- type: array; default: processlist,hosts
Preferred recursion method used to find slaves.
Possible methods are:
  METHOD       USES
  ===========  ==================
  processlist  SHOW PROCESSLIST
  hosts        SHOW SLAVE HOSTS
  none         Do not find slaves
The processlist method is preferred because SHOW SLAVE HOSTS
is not reliable. However, the hosts method is required if the server
uses a non-standard port (not 3306). Usually pt-slave-restart does the
right thing and finds the slaves, but you may give a preferred method
and it will be used first. If it doesn't find any slaves, the other
methods will be tried.
- --run-time
- type: time
Time to run before exiting. Causes pt-slave-restart to stop
after the specified time has elapsed. Optional suffix: s=seconds,
m=minutes, h=hours, d=days; if no suffix, s is used.
- --sentinel
- type: string; default: /tmp/pt-slave-restart-sentinel
Exit if this file exists.
- --slave-user
- type: string
Sets the user to be used to connect to the slaves. This
parameter allows you to have a different user with fewer privileges on
the slaves, but that user must exist on all slaves.
- --slave-password
- type: string
Sets the password to be used to connect to the slaves. It can
be used with --slave-user and the password for the user must be the same
on all slaves.
- --set-vars
- type: Array
Set the MySQL variables in this comma-separated list of
"variable=value" pairs.
By default, the tool sets:
wait_timeout=10000
Variables specified on the command line override these
defaults. For example, specifying "--set-vars wait_timeout=500"
overrides the default value of 10000.
The tool prints a warning and continues if a variable cannot
be set.
- --skip-count
- type: int; default: 1
Number of statements to skip when restarting the slave.
- --master-uuid
- type: string
When using GTID, an empty transaction should be created in
order to skip it. If writes are coming from different nodes in the
replication tree above, it is not possible to know which event from
which UUID to skip.
By default, transactions from the slave's master ('Master_UUID' from
"SHOW SLAVE STATUS") are skipped.
For example, with
master1 -> slave1 -> slave2
When skipping events on slave2 that were written to master1,
you must specify the UUID of master1, else the tool will use the UUID of
slave1 by default.
See "GLOBAL TRANSACTION IDS".
- --sleep
- type: int; default: 1
Initial sleep seconds between checking the slave.
See "SLEEP".
- --socket
- short form: -S; type: string
Socket file to use for connection.
- --stop
- Stop running instances by creating the sentinel file.
Causes "pt-slave-restart" to create the sentinel file specified by
"--sentinel". This should have the effect of stopping all running
instances which are watching the same sentinel file. If "--monitor"
isn't specified, "pt-slave-restart" will exit after creating the file.
If it is specified, "pt-slave-restart" will wait the interval given by
"--max-sleep", then remove the file and continue working.
You might find this handy to stop cron jobs gracefully if necessary, or
to replace one running instance with another. For example, if you want
to stop and restart "pt-slave-restart" every hour (just to make sure
that it is restarted every hour, in case of a server crash or some
other problem), you could use a "crontab" line like this:
0 * * * * pt-slave-restart --monitor --stop --sentinel /tmp/pt-slave-restartup
The non-default "--sentinel" will make sure the hourly "cron" job stops
only instances previously started with the same options (that is, from
the same "cron" job).
See also "--sentinel".
- --until-master
- type: string
Run until this master log file and position. Start the slave,
and retry if it fails, until it reaches the given replication
coordinates. The coordinates are the logfile and position on the master,
given by relay_master_log_file, exec_master_log_pos. The argument must
be in the format "file,pos". Separate the filename and
position with a single comma and no space.
This will also cause an UNTIL clause to be given to START
SLAVE.
After reaching this point, the slave should be stopped and
pt-slave-restart will exit.
- --until-relay
- type: string
Run until this relay log file and position. Like
"--until-master", but in the slave's relay logs instead. The
coordinates are given by relay_log_file, relay_log_pos.
- --user
- short form: -u; type: string
User for login if not current user.
- --verbose
- short form: -v; cumulative: yes; default: 1
Adds more information to the output. This flag can be
specified multiple times, for example -v -v or -vv. By default (no
verbose flag) the tool outputs connection information, a timestamp,
relay_log_file, relay_log_pos, and last_errno. One flag (-v) adds
last_error. See also "--error-length". Two flags (-vv) print
the current sleep time each time pt-slave-restart sleeps. To suppress
all output use the "--quiet" option.
- --version
- Show version and exit.
- --[no]version-check
- default: yes
Check for the latest version of Percona Toolkit, MySQL, and
other programs.
This is a standard "check for updates automatically"
feature, with two additional features. First, the tool checks its own
version and also the versions of the following software: operating
system, Percona Monitoring and Management (PMM), MySQL, Perl, MySQL
driver for Perl (DBD::mysql), and Percona Toolkit. Second, it checks for
and warns about versions with known problems. For example, MySQL 5.5.25
had a critical bug and was re-released as 5.5.25a.
The tool makes a secure connection to Percona’s Version Check database
server to perform these checks. Each request is logged by the
server, including software version numbers and the unique ID of the
checked system. The ID is generated by the Percona Toolkit installation
script or when the Version Check database call is made for the first
time.
Any updates or known problems are printed to STDOUT before the
tool's normal output. This feature should never interfere with the
normal operation of the tool.
For more information, visit
<https://www.percona.com/doc/percona-toolkit/LATEST/version-check.html>.
DSN OPTIONS
These DSN options are used to create a DSN. Each option is given like
"option=value". The options are case-sensitive, so P and p are not the
same option. There cannot be whitespace before or after the "=", and if
the value contains whitespace it must be quoted. DSN options are
comma-separated. See the percona-toolkit manpage for full details.
- A
dsn: charset; copy: yes
Default character set.
- D
dsn: database; copy: yes
Default database.
- F
dsn: mysql_read_default_file; copy: yes
Only read default options from the given file.
- h
dsn: host; copy: yes
Connect to host.
- p
dsn: password; copy: yes
Password to use when connecting. If the password contains commas,
they must be escaped with a backslash: "exam\,ple"
- P
dsn: port; copy: yes
Port number to use for connection.
- S
dsn: mysql_socket; copy: yes
Socket file to use for connection.
- u
dsn: user; copy: yes
User for login if not current user.
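To make the DSN syntax concrete, here is a minimal Python sketch that splits a DSN string into its key/value parts, honoring the backslash-escaped comma rule described for the p option. It is not the toolkit's own parser: the function name parse_dsn is hypothetical, and as a simplifying assumption the sketch rejects any part that has no "=" and does not handle quoted values.

```python
import re


def parse_dsn(dsn):
    """Split a DSN such as 'h=db1,P=3307,p=exam\\,ple' into a dict.

    Illustrative sketch only; not Percona Toolkit's parser.
    """
    # Split on commas that are not preceded by a backslash.
    parts = re.split(r'(?<!\\),', dsn)
    options = {}
    for part in parts:
        key, sep, value = part.partition('=')
        if not sep:
            # Simplification: every part must look like option=value.
            raise ValueError(f"missing '=' in DSN part: {part!r}")
        # Keys are case-sensitive, so 'P' (port) and 'p' (password)
        # are stored as distinct entries. Unescape commas in the value.
        options[key] = value.replace('\\,', ',')
    return options


if __name__ == "__main__":
    print(parse_dsn(r'h=db1,P=3307,u=repl,p=exam\,ple'))
```

Because the keys are kept exactly as written, a DSN that sets both P and p yields two separate entries rather than one overwriting the other.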
ENVIRONMENT
The environment variable "PTDEBUG" enables verbose debugging output to
STDERR. To enable debugging and capture all output to a file, run the
tool like:
PTDEBUG=1 pt-slave-restart ... > FILE 2>&1
Be careful: debugging output is voluminous and can generate
several megabytes of output.
SYSTEM REQUIREMENTS
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
BUGS
For a list of known bugs, see
<http://www.percona.com/bugs/pt-slave-restart>.
Please report bugs at
<https://jira.percona.com/projects/PT>. Include the following
information in your bug report:
- Complete command-line used to run the tool
- Tool "--version"
- MySQL version of all servers involved
- Output from the tool including STDERR
- Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with
"PTDEBUG"; see "ENVIRONMENT".
DOWNLOADING
Visit <http://www.percona.com/software/percona-toolkit/> to download the
latest release of Percona Toolkit. Or, get the latest release from the command
line:
wget percona.com/get/percona-toolkit.tar.gz
wget percona.com/get/percona-toolkit.rpm
wget percona.com/get/percona-toolkit.deb
You can also get individual tools from the latest release:
wget percona.com/get/TOOL
Replace "TOOL" with the name of any tool.
ABOUT PERCONA TOOLKIT
This tool is part of Percona Toolkit, a collection of advanced command-line
tools for MySQL developed by Percona. Percona Toolkit was forked from two
projects in June, 2011: Maatkit and Aspersa. Those projects were created by
Baron Schwartz and primarily developed by him and Daniel Nichter. Visit
<http://www.percona.com/software/> to learn about other free,
open-source software from Percona.
COPYRIGHT, LICENSE, AND WARRANTY
This program is copyright 2011-2018 Percona LLC and/or its affiliates, 2007-2011
Baron Schwartz.
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as published by
the Free Software Foundation, version 2; OR the Perl Artistic License. On
UNIX and similar systems, you can issue `man perlgpl' or `man perlartistic'
to read these licenses.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.