Chapter 20. Starting and Stopping MongoDB

--port

Specify the port number for the server to listen on. By default, mongod uses port
27017, which is unlikely to be used by another process (besides other mongod
processes). If you would like to run more than one mongod process on a single machine,
you’ll need to specify different ports for each one. If you try to start mongod on a
port that is already being used, it will give an error:
"Address already in use for socket: 0.0.0.0:27017"

--fork

Fork the server process, running MongoDB as a daemon.
If you are starting up mongod for the first time (with an empty data directory), it
can take the filesystem a few minutes to allocate database files. The parent process
will not return from forking until the preallocation is done and mongod is ready
to start accepting connections. Thus, fork may appear to hang. You can tail the log
to see what it is doing. You must use --logpath if you specify --fork.
--logpath

This option sends all output to the specified file rather than outputting on the
command line. This will create the file if it does not exist, assuming you have write
permissions to the directory. It will also overwrite the log file if it already exists,
erasing any older log entries. If you’d like to keep old logs around, use the
--logappend option in addition to --logpath (highly recommended).
--directoryperdb

This puts each database in its own directory. This allows you to mount different
databases on different disks, if necessary or desired. Common uses for this are
putting a local database (replication) on its own disk or moving a database to a
different disk if the original one fills up. You could also put databases that handle
more load on faster disks and those that handle less on slower ones. It basically gives
you more flexibility to move things around later.
--config

Use a configuration file for additional options not specified on the command line.
This is typically used to make sure options are the same between restarts. See
“File-Based Configuration” for details.
For example, to start the server as a daemon listening on port 5586 and sending all
output to mongodb.log, we could run this:
$ ./mongod --port 5586 --fork --logpath mongodb.log --logappend
forked process: 45082
all output going to: mongodb.log

Note that mongod may decide to preallocate journal files before it considers itself
“started.” If it does, fork will not return to the command prompt until the preallocation has
finished. You can tail mongodb.log (or wherever you redirected the log file) to watch its
progress.
When you first install and start MongoDB, it is a good idea to look at the log. This might
be an easy thing to miss, especially if MongoDB is being started from an init script, but
the log often contains important warnings that can help you prevent later errors. If
you don’t see any warnings in the MongoDB log on startup, then you are all set. (Startup
warnings will also appear on shell startup.)
If there are any warnings in the startup banner, take note of them. MongoDB will warn
you about a variety of issues: that you’re running on a 32-bit machine (which MongoDB
is not designed for), that you have NUMA enabled (which can slow your application to
a crawl), or that your system does not allow enough open file descriptors (MongoDB
uses a lot of file descriptors).
The log preamble won’t change when you restart the database, so feel free to run
MongoDB from an init script and ignore the logs, once you know what they say. However,
it’s a good idea to check again each time you install, upgrade, or recover from
a crash, just to make sure MongoDB and your system are on the same page.
When you start the database, MongoDB will write a document to the local.startup_log
collection that describes the version of MongoDB, underlying system, and flags used:
> db.startup_log.findOne()
{
    "_id" : "spock-1360621972547",
    "hostname" : "spock",
    "startTime" : ISODate("2013-02-11T22:32:52Z"),
    "startTimeLocal" : "Mon Feb 11 17:32:52.547",
    "cmdLine" : {
        ...
    },
    "pid" : 28243,
    "buildinfo" : {
        "version" : "2.4.0-rc1-pre-",
        ...
        "versionArray" : [
            2,
            4,
            0,
            -9
        ],
        "javascriptEngine" : "V8",
        "bits" : 64,
        "debug" : false,
        "maxBsonObjectSize" : 16777216
    }
}

This collection can be useful for tracking upgrades and changes in behavior.

File-Based Configuration
MongoDB supports reading configuration information from a file. This can be useful
if you have a large set of options you want to use or are automating the task of starting up
MongoDB. To tell the server to get options from a configuration file, use the -f or
--config flags. For example, run mongod --config ~/.mongodb.conf to use
~/.mongodb.conf as a configuration file.
The options supported in a configuration file are exactly the same as those accepted at
the command line. Here’s an example configuration file:
# Start MongoDB as a daemon on port 5586
port = 5586
fork = true # daemonize it!
logpath = /var/log/mongodb.log
logappend = true

This configuration file specifies the same options we used earlier when starting with
regular command-line arguments. It also highlights most of the interesting aspects of
MongoDB configuration files:
• Any text on a line that follows the # character is ignored as a comment
• The syntax for specifying options is option = value, where option is case-sensitive
• For command-line switches like --fork, the value true should be used
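
To make the correspondence between the two forms concrete, here is a minimal JavaScript sketch that turns a configuration file in the option = value format into the equivalent command-line arguments. The function name configToArgs is hypothetical, and this is an illustration of the format only, not MongoDB's own parser.

```javascript
// Sketch: translate "option = value" lines into mongod command-line flags.
// configToArgs is a hypothetical helper written for this illustration.
function configToArgs(text) {
  const args = [];
  for (const rawLine of text.split("\n")) {
    // Any text after "#" is a comment and is ignored.
    const line = rawLine.split("#")[0].trim();
    if (line === "") continue;
    const eq = line.indexOf("=");
    if (eq === -1) throw new Error("expected option = value: " + rawLine);
    const option = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    if (value === "true") {
      args.push("--" + option);        // a switch such as --fork
    } else {
      args.push("--" + option, value); // an option that takes a value
    }
  }
  return args;
}

const sample = [
  "# Start MongoDB as a daemon on port 5586",
  "port = 5586",
  "fork = true # daemonize it!",
  "logpath = /var/log/mongodb.log",
  "logappend = true",
].join("\n");

console.log(configToArgs(sample).join(" "));
```

Run against the example file, the sketch prints the same flags used earlier in the chapter, with the log path taken from the file.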

Stopping MongoDB
Being able to safely stop a running MongoDB server is at least as important as being
able to start one. There are a couple of different options for doing this effectively.
The cleanest way to shut down a running server is to use the shutdown command,
{"shutdown" : 1}. This is an admin command and must be run on the admin database.
The shell features a helper function to make this easier:
> use admin
switched to db admin
> db.shutdownServer()
server should be down...

The shutdown command, when run on a primary, steps down the primary and waits
for a secondary to catch up before shutting down the server. This minimizes the chance
of rollback, but the shutdown isn’t guaranteed to succeed. If there is no secondary available
that can catch up within a few seconds, the shutdown command will fail and the
(former) primary will not shut down:


> db.shutdownServer()
{
"closest" : NumberLong(1349465327),
"difference" : NumberLong(20),
"errmsg" : "no secondaries within 10 seconds of my optime",
"ok" : 0
}

You can force the shutdown command to shut down a primary by using the force option:
db.adminCommand({"shutdown" : 1, "force" : true})

This is equivalent to sending a SIGINT or SIGTERM signal (all three of these options
result in a clean shutdown, but there may be unreplicated data). If the server is running
as the foreground process in a terminal, a SIGINT can be sent by pressing Ctrl-C.
Otherwise, a command like kill can be used to send the signal. If mongod has 10014 as
its PID, the command would be kill -2 10014 (SIGINT) or kill 10014 (SIGTERM).
When mongod receives a SIGINT or SIGTERM, it will do a clean shutdown. This means
it will wait for any running operations or file preallocations to finish (this could take a
moment), close all open connections, flush all data to disk, and halt.

Security
Do not set up publicly addressable MongoDB servers. You should restrict access as
tightly as possible between the outside world and MongoDB. The best way to do this is
to set up firewalls and only allow MongoDB to be reachable on internal network addresses.
Chapter 23 covers what connections are necessary to allow between MongoDB
servers and clients.
Beyond firewalls, there are a few options you can add to your config file to make it more
secure:
--bind_ip

Specify the interfaces that you want MongoDB to listen on. Generally you want this
to be an internal IP: something application servers and other members of your
cluster can access but is inaccessible to the outside world. localhost is fine for mongos
processes if you’re running the application server on the same machine. For
config servers and shards, they’ll need to be addressable from other machines, so
stick with non-localhost addresses.
--nohttpinterface

By default, MongoDB starts a tiny HTTP server on a port 1000 above wherever you
started MongoDB. This gives you some information about your system, but nothing
you can’t get elsewhere. It is of little use on a machine you probably only
access via SSH, and it exposes information that should be inaccessible to the outside
world. Unless you’re in development, this should be turned off.
--nounixsocket

If you’re not planning to connect via file system socket, you might as well use
this option to disable it. You would only connect via file system socket on a machine that is also
running an application server: you must be local to use a file system socket.
--noscripting

This entirely disallows server-side JavaScript execution. Most security issues that
have been reported with MongoDB have been JavaScript-related and it is generally
safer to disallow it, if your application allows.
Several shell helpers assume that JavaScript is available on the server, notably
sh.status(). You will see errors if you attempt to run any of these helpers with
JavaScript disabled.
Do not enable the REST interface. It is disabled by default and allows running many
commands on the server. It is not intended for production use.

Data Encryption
As of this writing, MongoDB provides no built-in mechanism for encrypting stored
data. If you require data to be encrypted, use filesystem encryption. Another possibility
is manually encrypting certain fields (although MongoDB has no special ability to query
for encrypted values).

SSL Connections
By default, connections to MongoDB transfer data unencrypted. However, SSL connection
support is available. Due to licensing issues, the default builds do not have SSL,
but you can download a subscriber build at http://www.10gen.com, which supports SSL.
You can also compile MongoDB from source to enable SSL support. Consult your driver’s
documentation on how to create SSL connections using your language.

Logging
By default, mongod sends its logs to stdout. Most init scripts use the --logpath option
to send logs to a file. If you have multiple MongoDB instances on a single machine (say,
a mongod and a mongos), make sure that their logs are stored in separate files. Make
sure that you know where the logs are and have read access to the files.
MongoDB spits out a lot of log messages, but please do not run with the --quiet option
(which suppresses some of them). Leaving the log level at the default is usually perfect:
there is enough info for basic debugging (why is this slow, why isn’t this starting up,
etc.), but the log does not take up too much space. If you are debugging a specific issue
with your application, there are a couple of options for getting more info from the logs.
First, you can change the log level, either by restarting MongoDB with more v’s (for example, -vvv) or
running the setParameter command:
> db.adminCommand({"setParameter" : 1, "logLevel" : 3})

Remember to turn log level back down to 0, or your logs may be needlessly noisy. You
can turn log level up to 5, at which point mongod will print out almost every action it
takes, including the contents of every request handled. This can cause a lot of IO as
mongod writes everything to the log file, which can slow down a busy system. Turning
on profiling is a better option if you need to see every operation as it’s happening.
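
As a small illustration of the two ways to choose a log level, the JavaScript sketch below builds both the startup flag and the setParameter document for a given level. The helper name logLevelOptions is hypothetical; it only assembles values and does not talk to a server.

```javascript
// Sketch: the startup flag and the command document for one log level.
// logLevelOptions is a hypothetical helper written for this illustration.
function logLevelOptions(level) {
  if (!Number.isInteger(level) || level < 0 || level > 5) {
    throw new RangeError("log level must be an integer from 0 to 5");
  }
  return {
    // One "v" per level; level 0 needs no flag at all.
    startupFlag: level === 0 ? "" : "-" + "v".repeat(level),
    // Document to pass to db.adminCommand() on a running server.
    command: { setParameter: 1, logLevel: level },
  };
}

console.log(logLevelOptions(3));
```

Calling the helper with 0 gives the document that turns logging back down once debugging is finished.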
By default, MongoDB logs information about queries that take longer than 100 ms to
run. If 100 ms is too short or too long for your application, you can change the threshold
with setProfilingLevel:
> // Only log queries that take longer than 500ms
> db.setProfilingLevel(1, 500)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
> db.setProfilingLevel(0)
{ "was" : 1, "slowms" : 500, "ok" : 1 }

The second line will turn off profiling, but the value in milliseconds given in the first
line will continue to be used as a threshold for the log (across all databases). You can
also set this parameter by restarting MongoDB with the --slowms option.
Finally, set up a cron job that rotates your log every day or week. If MongoDB was started
with --logpath, sending the process a SIGUSR1 signal will make it rotate the log. There
is also a logRotate command that does the same thing:
> db.adminCommand({"logRotate" : 1})

You cannot rotate logs if MongoDB was not started with --logpath.


Chapter 21. Monitoring MongoDB

Before you deploy, it is important to set up some type of monitoring. Monitoring should
allow you to track what your server is doing and alert you if something goes wrong.
This chapter will cover:
• How to track MongoDB’s memory usage
• How to track application performance metrics
• How to diagnose replication issues
Examples use charts from the MongoDB Monitoring Service (MMS) to demonstrate what
to look for when monitoring. There are installation instructions for MMS at
https://mms.10gen.com. If you do not want to use MMS, please use some type of monitoring.
It will help you detect potential issues before they cause problems and let you diagnose
issues when they occur.

Monitoring Memory Usage
Accessing data in memory is fast and accessing data on disk is slow. Unfortunately,
memory is expensive (and disk is cheap) and typically MongoDB uses up memory
before any other resource. This section covers how to monitor MongoDB’s interactions
with disk and memory, and what to watch for.

Introduction to Computer Memory
Computers tend to have a small amount of fast-to-access memory and a large amount
of slow-to-access disk. When you request a page of data that is stored on disk (and not
yet in memory), your system page faults and copies the page from disk into memory. It
can then access the page in memory extremely quickly. If your program stops regularly
using the page and your memory fills up with other pages, the old page will be evicted
from memory and only live on disk again.
Copying a page from disk into memory takes a lot longer than reading a page from
memory. Thus, the less MongoDB has to copy data from disk, the better. If MongoDB
can operate almost entirely in memory, it will be able to access data much faster. Thus,
MongoDB’s memory usage is one of the most important stats to track.

Tracking Memory Usage
There are several “types” of memory MongoDB reports using. First is resident memory:
this is the memory that MongoDB explicitly owns in RAM. For example, if we query
for a document and it is paged into memory, that page is added to MongoDB’s resident
memory.
MongoDB is given an address for that page. This address isn’t the literal address of the
page in RAM. It’s a virtual address. MongoDB can pass it to the kernel and the kernel
will look up where the page really lives. This way, if the kernel needs to evict the page
from memory, MongoDB can still use the address to access it. MongoDB will request
the memory from the kernel, the kernel will look at its page cache, see that the page is
not there, page fault to copy the page into memory, and return it to MongoDB. The
pages of data that MongoDB has addresses for determine how MongoDB’s mapped memory is
calculated: it includes all of the data MongoDB has ever accessed. It will usually be about
the size of your data set.
MongoDB keeps an extra virtual address for each page of mapped memory for journaling
to use (see Chapter 19). This doesn’t mean that there are two copies of the data
in memory, just two addresses. Thus, the total virtual memory MongoDB uses will be
approximately twice your mapped memory size (or twice your data size). If journaling
is disabled, mapped and virtual memory sizes will be approximately equal.
Note that both virtual memory and mapped memory are not “real” memory allocations:
they do not tell you anything about how much RAM is being used. They are just mappings
that MongoDB is keeping. Theoretically, MongoDB could have a petabyte of
memory mapped and only a couple of gigabytes in RAM. Thus, you do not have to
worry if mapped or virtual memory sizes exceed RAM.
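
The relationship between these numbers can be sketched in a few lines of JavaScript. The function below takes an object shaped like the mem section of serverStatus's output (resident, virtual, and mapped, in megabytes; these field names are an assumption about that output) and checks whether virtual memory is roughly what the rule above predicts. The function name, the 25% tolerance, and the sample values are made up for illustration.

```javascript
// Sketch: compare reported virtual memory with the size the text predicts.
// describeMemory, the tolerance, and the sample values are hypothetical.
function describeMemory(mem, journaling, tolerance = 0.25) {
  // With journaling, each mapped page has two virtual addresses.
  const expectedVirtual = journaling ? 2 * mem.mapped : mem.mapped;
  const ratio = mem.virtual / expectedVirtual;
  return {
    expectedVirtualMB: expectedVirtual,
    virtualLooksNormal: Math.abs(ratio - 1) <= tolerance,
    // Only resident memory says anything about physical RAM in use.
    residentMB: mem.resident,
  };
}

const sampleMem = { resident: 7500, virtual: 41000, mapped: 20000 };
console.log(describeMemory(sampleMem, true));
```

In the sample, mapped and virtual sizes both far exceed resident memory, which is the normal situation the text describes rather than a sign of trouble.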
Figure 21-1 shows the MMS graph for memory information, which describes how much
resident, virtual, and mapped memory MongoDB is using. On a box dedicated to MongoDB,
resident should be a little less than the total memory size (assuming your working
set is as large as or larger than memory). Resident memory is the only statistic that actually
tracks how much data is in physical RAM, but by itself this stat does not tell you much
about how MongoDB is using memory.


Figure 21-1. From the top line to the bottom: virtual, mapped, and resident memory
If your data fits entirely in memory, resident should be approximately the size of your
data. When we talk about data being “in memory,” we’re always talking about the data
being in RAM.

Tracking Page Faults
As you can see from Figure 21-1, memory metrics tend to be fairly steady, but as your
data set grows virtual and mapped will grow with it. Resident will grow to the size of
your available RAM and then hold steady.
You can use other statistics to find out how MongoDB is using memory, not just how
much of each type it has. One useful stat is number of page faults, which tells you how
often the data MongoDB is looking for is not in RAM. Figure 21-2 and Figure 21-3 are
graphs of page faults over time. The system in Figure 21-3 is page faulting less than the one in Figure 21-2, but by
itself this information is not very useful. If the disk in Figure 21-2 can handle that many
faults and the application can handle the delay of the disk seeks, there is no particular
problem with having so many faults (or more). On the other hand, if your application
cannot handle the increased latency of reading data from disk, you have no choice but
to store all of your data in memory (or use SSDs).


Figure 21-2. A system that is page faulting hundreds of times a minute

Figure 21-3. A system that is page faulting a few times a minute
Regardless of how forgiving the application is, page faults become a problem when the
disk is overloaded. The amount of load a disk can handle isn’t linear: once a disk begins
getting overloaded, each operation must queue for a longer and longer period of time,
creating a chain reaction. There is usually a tipping point where disk performance begins
degrading quickly. Thus, it is a good idea to stay away from the maximum load that your
disk can handle.
Track your page fault numbers over time. If your application is behaving well with a
certain number of page faults, you have a baseline for how many page faults the system
can handle. If page faults begin to creep up and performance deteriorates, you have a
threshold to alert on.
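
One way to turn this advice into a check is sketched below in JavaScript. It assumes you periodically record a cumulative page fault counter (for instance, one read from serverStatus) together with a timestamp, converts consecutive samples into faults per minute, and flags intervals that exceed a multiple of the baseline. The function names, the sample data, and the factor of 2 are all hypothetical.

```javascript
// Sketch: per-minute page fault rates from a cumulative counter,
// compared against a known-good baseline. All names are hypothetical.
function faultRates(samples) {
  const rates = [];
  for (let i = 1; i < samples.length; i++) {
    const minutes = (samples[i].timeMs - samples[i - 1].timeMs) / 60000;
    rates.push((samples[i].faults - samples[i - 1].faults) / minutes);
  }
  return rates;
}

// True for each interval whose rate is above baseline * factor.
function exceedsBaseline(rates, baselinePerMinute, factor) {
  return rates.map((rate) => rate > baselinePerMinute * factor);
}

const samples = [
  { timeMs: 0, faults: 1000 },
  { timeMs: 60000, faults: 1200 },
  { timeMs: 120000, faults: 1900 },
];
const rates = faultRates(samples);
console.log(rates, exceedsBaseline(rates, 200, 2));
```

Here the first interval matches the baseline of 200 faults per minute and the second, at 700, would trigger an alert. The right factor depends on how much headroom your disk has.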
You can see page fault stats per-database by looking at the "recordStats" field of
serverStatus’s output:
> db.adminCommand({"serverStatus" : 1})["recordStats"]
{
"accessesNotInMemory": 200632,
"test": {
