readdir_r() is a thread-safe version of readdir(), although readdir() is
not particularly thread-unsafe with most usage. The dirent* returned by
readdir() can only be invalidated by a subsequent readdir() or
closedir() on the same DIR*. In typical usage, where a returned dirent*
is used exclusively within a loop around readdir() and is not expected
to outlive that loop, there are no lifetime or thread-safety issues with
the use of readdir().
readdir_r() may be harmful in certain situations because its buffer is
not explicitly sized, and attempts to provide a suitably sized buffer
dynamically (which, incidentally, our code did not do) are subject to a
race condition.
https://elliotth.blogspot.com/2012/10/how-not-to-use-readdirr3.htmlhttps://womble.decadent.org.uk/readdir_r-advisory.html
glibc has already deprecated readdir_r(), and all Linux (including
Android) code was already using readdir(). This change eliminates
variant codepaths. It delegates buffer sizing (which we weren’t doing
correctly) to the C library, which also has more options at its disposal
to avoid races in sizing that buffer.
Change-Id: I4fca8948454116360180ad0017f226d06727ef81
Reviewed-on: https://chromium-review.googlesource.com/705756
Reviewed-by: Joshua Peraza <jperaza@chromium.org>
The "file-limit" annotation has shown that the system as a whole is not
likely to be out of file descriptors globally. It’s possible that a file
descriptor leak in crashpad_handler itself is responsible for certain
crashes. Add a count of the number of open files in the handler process
to this annotation to test this theory.
Bug: crashpad:180
Change-Id: If6f2304fdabddd29636ba4ac5a7d1e0fff7f4b61
Reviewed-on: https://chromium-review.googlesource.com/585852
Reviewed-by: Robert Sesek <rsesek@chromium.org>
The "file-limit" annotation will be used to confirm the theory that
certain crashes are caused by systems at or near their file descriptor
table size limits.
The annotation records the system-wide kern.num_files and kern.maxfiles
values, and the process-specific current and maximum file descriptor
limits.
The annotation will be set on crashpad_handler startup, and will be
refreshed every time an exception is handled and every time the upload
thread processes a pending report.
It’s expected that this annotation will be removed after enough data has
been collected to confirm the theory. However, the principle is useful
enough that we may want to provide this feature more generally under
bugs 19 or 21.
Bug: crashpad:180
Change-Id: I3bb78fae60e0567bc4ac2625716e0abe0ddae08c
Reviewed-on: https://chromium-review.googlesource.com/479914
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Self-monitoring revealed this CHECK was being hit in the wild:
base::debug::BreakDebugger() debugger_posix.cc:260
logging::LogMessage::~LogMessage() logging.cc:759
logging::MachLogMessage::~MachLogMessage() mach_logging.cc:45
crashpad::ExceptionHandlerServer::Run() exception_handler_server.cc:108
crashpad::HandlerMain() handler_main.cc:744
The MACH_CHECK() was:
108 MACH_CHECK(mr == MACH_MSG_SUCCESS, mr) << "MachMessageServer::Run";
Crash reports captured the full message, including the value of mr:
[0418/015158.777231:FATAL:exception_handler_server.cc(108)] Check failed: mr == MACH_MSG_SUCCESS. MachMessageServer::Run: (ipc/send) invalid destination port (0x10000003)
0x10000003 = MACH_SEND_INVALID_DEST.
This can happen when attempting to send a Mach message to a dead name.
Send (and send-once) rights become dead names when the corresponding
receive right dies. This would not normally happen for exception
requests originating in the kernel. It can happen for requests
originating from a user task: when the user task dies, the receive right
dies with it. All it takes to trigger this CHECK() in crashpad_handler
is for a Crashpad client to die (or be killed) while the handler is
processing a SimulateCrash() that the client originated.
Accept MACH_SEND_INVALID_DEST as a valid return value for
MachMessageServer::Run().
Note that MachMessageServer’s test coverage was already aware of this
behavior. MachMessageServer::Run()’s documentation is updated to reflect
it too.
Change-Id: I483c065d3c5f9a7da410ef3ad54db45ee53aa3c2
Reviewed-on: https://chromium-review.googlesource.com/479093
Commit-Queue: Mark Mentovai <mark@chromium.org>
Reviewed-by: Robert Sesek <rsesek@chromium.org>
76a67a37b1d0 adds crashpad_handler’s --monitor-self argument, which
results in a second crashpad_handler instance running out of the same
database as the initial crashpad_handler instance that it monitors. The
two handlers start at nearly the same time, and will initially be on
precisely the same schedule for periodic tasks such as scanning for new
reports to upload and pruning the database. This is an unnecessary
duplication of effort.
This adds a new --no-periodic-tasks argument to crashpad_handler. When
the first instance of crashpad_handler starts a second to monitor it, it
will use this argument, which prevents the second instance from
performing these tasks.
When --no-periodic-tasks is in effect, crashpad_handler will still be
able to upload crash reports that it knows about by virtue of having
written them itself, but it will not scan the database for other pending
reports to upload.
Bug: crashpad:143
Test: crashpad_util_test ThreadSafeVector.ThreadSafeVector
Change-Id: I7b249dd7b6d5782448d8071855818f986b98ab5a
Reviewed-on: https://chromium-review.googlesource.com/473827
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Previously, only the top-level exception code was reported via the
Crashpad.ExceptionCode.Mac histogram. Making this histogram work
(https://crbug.com/678720) has revealed that Chrome is triggering
EXC_RESOURCE exceptions at a rate in excess of 4x that of ordinary
crashes. These exceptions were not previously visible because they are
not uploaded unless the system treats them as fatal, which it does not
normally do absent an explicit request.
In order to learn more about the problem, this change augments the data
reported via the Crashpad.ExceptionCode.Mac histogram to report (at
least) second-level exception data. This means that we will no longer
see just EXC_RESOURCE, but potentially more useful information such as
EXC_RESOURCE / RESOURCE_TYPE_IO / FLAVOR_IO_PHYSICAL_WRITES. This also
applies to other exception types, so that the majority of crashes
currently falling into the EXC_CRASH bucket will now have additional
information decoded and will be reported as, for example, EXC_BAD_ACCESS
/ KERN_INVALID_ADDRESS, EXC_BAD_INSTRUCTION / EXC_I386_INVOP, and
EXC_CRASH / SIGABRT.
Because the old mechanism was only live (in an “it works” sense) for
several days, and the new mechanism does not overlap with histogram
values used by the old one, there’s no need to invent a new histogram
name.
BUG=chromium:684051
Change-Id: Ia0a372b4127f7b3b2e7dbbaac9304cce3b5aadfe
Reviewed-on: https://chromium-review.googlesource.com/430933
Reviewed-by: Scott Graham <scottmg@chromium.org>
Includes mini_chromium DEPS roll for:
88e0a3e Add stub of sparse_histogram.h
R=mark@chromium.org
BUG=crashpad:100
Change-Id: I4c541a33be0f7f47e972af638d4765bd06682acf
Reviewed-on: https://chromium-review.googlesource.com/386385
Reviewed-by: Mark Mentovai <mark@chromium.org>
This was done in Chromium’s local copy of Crashpad in 562827afb599. This
change is similar to that one, except more care was taken to avoid
including headers from a .cc or _test.cc when already included by the
associated .h. Rather than using <stddef.h> for size_t, Crashpad has
always used <sys/types.h>, so that’s used here as well.
This updates mini_chromium to 8a2363f486e3a0dc562a68884832d06d28d38dcc,
which removes base/basictypes.h.
e128dcf10122 Remove base/move.h; use std::move() instead of Pass()
8a2363f486e3 Move basictypes.h to macros.h
R=avi@chromium.org
Review URL: https://codereview.chromium.org/1566713002 .
This more-natural spelling doesn’t require Crashpad developers to have
to remember anything special when writing code in Crashpad. It’s easier
to grep for and it’s easier to remove the “compat” part when pre-C++11
libraries are no longer relevant.
R=scottmg@chromium.org
Review URL: https://codereview.chromium.org/1513573005 .
Previously, there was a window after starting the upload thread but
before the SIGTERM handler was installed, where receipt of SIGTERM
could have interrupted an in-progress upload. There was also the
possibility that a second SIGTERM sent after the exception handler
stopped running would interrupt an in-progress upload. By pulling the
signal handler out of ExceptionHandlerServer and into the main
function, these races are avoided.
BUG=crashpad:25
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1429353002 .
By invoking crashpad_handler with --mach-service instead of
--handshake-fd, the handler will run as a well-behaved launchd job. The
launchd job may be as a launch agent or launch daemon, or be submitted
to launchd by on_demand_service_tool.
BUG=crashpad:25
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1414533006 .
Each routine in this implementation returns MIG_BAD_ID. These routines
may be overridden.
Most things that implement NotifyServer::Interface will only need to
implement one of the interface routines. Since another user of
NotifyServer will be added soon, it makes sense to provide a default
no-op implementation rather than forcing everyone to write the same
no-op boilerplate repeatedly.
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1414413006 .
Previously, crashpad_handler made its own receive right, and transferred
a corresponding send right to its client. There are two advantages to
making the receive right in the client:
- It is possible to monitor the receive right for a port-destroyed
notificaiton in the client, allowing the handler to be restarted if
it dies.
- For the future run-from-launchd mode (bug crashpad:25), the handler
will obtain its receive right from the bootstrap server instead of
making its own. Having the handler get its receive right from
different sources allows more code to be shared than if it were to
sometimes get a receive right and sometimes make a receive right and
transfer a send right.
This includes a restructuring in crashpad_client_mac.cc that will make
it easier to give it an option to restart crashpad_handler if it dies.
The handler starting logic should all behave the same as before.
BUG=crashpad:68
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1409073013 .
In https://codereview.chromium.org/1411523006, the Mach port scopers are
becoming better ScopedGenerics and are losing the type conversion
operators in the process. This is needed to adapt to that change. get()
is ugly, but being explicit about conversion isn’t a bad thing, and
these scopers will gain functionality such as Pass() as part of the
switch.
As a bonus, some would-be uses of get() to check for valid port rights
are becoming a more descriptive is_valid().
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1405273002 .
launchd actually does set the EXC_RESOURCE and EXC_GUARD handlers
exactly the same way that it sets the EXC_CRASH handler. See 10.9.5
launchd-842.92.1/src/core.c job_setup_exception_port().
Cases where an EXC_CRASH handler is set but EXC_RESOURCE and EXC_GUARD
handlers are not set occur when the exception ports are set by
/usr/bin/login instead of launchd. login looks up the
exception-reporting service by name and sets the exception port without
including EXC_MASK_RESOURCE or EXC_MASK_GUARD in the mask. See 10.10.5
system_cmds-643.30.1/login.tproj/login.c main().
login is a setuid executable, so it does not inherit its parent process’
exception handlers. See 10.10.5 xnu-2782.40.9/osfmk/kern/ipc_tt.c
ipc_task_reset().
Terminal.app executes login when establishing its command-line
environment, so the exception handlers set for Terminal.app itself
(including EXC_MASK_CRASH, EXC_MASK_RESOURCE, and EXC_MASK_GUARD) are
discarded, and then login sets an exception handler only for
EXC_MASK_CRASH. The same thing occurs for any other process descended
from login, including SSH sessions, because sshd executes login.
This is a bug in login filed as Apple radar 22978644. This bug led to a
misunderstanding about the use of EXC_RESOURCE and EXC_GUARD. Comments
that discuss this behavior are now reworded to be accurate, and
non-fatal EXC_RESOURCE exceptions are made eligible for forwarding to
the user ReportCrash (because it would normally handle them in the
absence of Crashpad) while Crashpad itself will still skip processing
them.
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1391453002 .
Chrome’s relauncher process needs a way to sever ties with the
crashpad_handler instance running from the disk image in order to cause
that instance to exit so that the disk image may be unmounted. This new
function is otherwise not thought to be interesting, and its use is not
recommended.
This comes with a small refactoring to create a
SystemCrashReporterHandler() function, and a fix for a minor port leak
in CrashReportExceptionHandler::CatchMachException().
BUG=chromium:538373
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1375573005 .
CrashReportExceptionHandler::CatchMachException() must always set a
valid new_state. Failing to do so appears to trigger corpse generation
on OS X 10.11. This is addressed by calling ExcServerCopyState().
Previously, this was not done for exceptions forwarded to the user
ReportCrash, under the apparent mistaken assumption that ReportCrash
would do it. However, ReportCrash is given copies of out-parameters like
new_state to explicitly prevent it from influencing Crashpad’s returned
state.
ExcServerSuccessfulReturnValue() must not return MACH_RCV_PORT_DIED for
an EXC_CRASH handler on OS X 10.11. This appears to trigger corpse
generation. This is addressed by always returning KERN_SUCCESS from
EXC_CRASH handlers on OS X 10.11.
This also adds generic EXC_CORPSE_NOTIFY support throughout Crashpad.
The crashpad_handler does not listen for this exception type, but it is
now possible to work with this exception type using tools like
exception_port_tool and catch_exception_tool.
BUG=crashpad:48
TEST=Crashes handled by crashpad_handler do not result in the generation
of reports in the root /Library/Logs/DiagnosticReports.
R=kerrnel@chromium.org, rsesek@chromium.org
Review URL: https://codereview.chromium.org/1305893010 .
The main goal was to get the beginnings of module iteration and retrieval
of CrashpadInfo in snapshot. The main change for that is to move
crashpad_info_client_options[_test] down out of mac/.
This also requires adding some of the supporting code of snapshot in
ProcessReaderWin, ProcessSnapshotWin, and ModuleSnapshotWin. These are
partially copied from Mac or stubbed out with lots of TODO annotations.
This is a bit unfortunate, but seemed like the most productive way to
make progress incrementally. That is, it's mostly placeholder at the
moment, but hopefully has the right shape for things to come.
R=mark@chromium.org
BUG=crashpad:1
Review URL: https://codereview.chromium.org/1052813002
This adds IsExceptionNonfatalResource() and its test, and uses it in
crashpad_handler. When non-fatal resource exceptions are encountered, no
crash report is generated. crashpad_handler swallows these exceptions.
Alternatively, it could allow them to be sent to the system’s host-level
resource exception handler, normally com.apple.ReportCrash.root, which
would allow them to be processed in the same way as when Crashpad is not
in use. I’m not sure which option is better. I chose to swallow them
because there doesn’t appear to be much value in letting
com.apple.ReportCrash.root and spindump look at them.
This also moves ExcCrashRecoverOriginalException() to the new file as a
sibling of IsExceptionNonfatalResource(). This provides better
organization.
BUG=crashpad:35, chromium:474163, chromium:474326
TEST=crashpad_util_test ExceptionTypes.IsExceptionNonfatalResource
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1066243002
These two exception types use all 64 bits of the code[0] field. The
ExceptionSnapshot was unprepared to stuff this into a 32-bit field. To
resolve the discrepancy, the more-significant data is taken from the
high 32 bits of code[0]. No information is lost because the full code[0]
is made available as part of the Codes() vector.
BUG=crashpad:34
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1050313003
The wiki existed at https://code.google.com/p/crashpad/wiki, but given
Google Code Project Hosting’s impending shutdown[1], it’s prudent to
move wiki documents into the source code repository.
This change moves the existing contents of doc into doc/support, to make
way for documentation in doc. The two existing wiki pages, ProjectStatus
and DevelopingCrashpad, are converted to AsciiDoc format (a fairly
straightforward conversion) and checked in to doc. generate_asciidoc.sh
is updated to produce HTML output from these files. The generated HTML
will show up at http://docs.crashpad.googlecode.com/git/doc/. Note that
generated HTML is still hosted on Google Code Project Hosting, but it’ll
be easy to find a new home for them.
[1]
http://google-opensource.blogspot.com/2015/03/farewell-to-google-code.htmlR=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1055523002
ExcServerCopyState() properly sets the new_state and new_state_count
out-parameters for exception handler routines that may deal with
state-carrying exceptions.
This used to exist inline in catch_exception_tool, but that
implementation had a bug caught by the new test.
TEST=crashpad_util_test ExcServerVariants.ExcServerCopyState and others
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1049023003
Add MapInsertOrReplace<>() to insert a key-value pair into a map if the
key is not already present, or replace the existing value for key if the
key is present. The original value can optionally be returned to the
caller in this case.
Map insertions now use either MapInsertOrReplace<>() or
std::map<>::insert() directly.
Use MapInsertOrReplace<>() when the map should be updated to contain a
mapping from a key to a value regardless of whether the key is already
present.
Use std::map<>::insert() to insert a mapping from a key to a value
without replacing any existing mapping from a key, if present. If it is
important to know whether an existing mapping from a key was present,
use the returned std::pair<>.second. If it is important to know the
existing value, use the returned std::pair<>.first->second.
This change has a slight positive impact on performance.
TEST=crashpad_util_test MapInsert.MapInsertOrReplace and others
BUG=
R=scottmg@chromium.org
Review URL: https://codereview.chromium.org/1044273002
Now that Chrome’s about:crashes displays the crash report UUID, I wanted
to add it to the minidump. In the future, we may be able to index these
on the server. This will also help identify dumps that correspond to the
same event once we’re equipped to convert between different formats.
Ideally, this new field is populated with the same UUID used locally in
the crash report database. To make this work,
CrashReportDatabase::NewReport must carry the UUID. This was actually
part of CrashReportDatabaseWin’s private extension to NewReport, so that
extension subclass can now be cleaned up.
TEST=crashpad_minidump_test MinidumpCrashpadInfoWriter.*,
crashpad_client_test CrashReportDatabaseTest.NewCrashReport
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1000263003
This makes it easier for clients to start the Crashpad handler, instead
of requiring them to know how to construct arguments for the handler
themselves. Note in the TEST that -a is no longer required.
TEST=run_with_crashpad --handler crashpad_handler \
--database=/tmp/crashpad_db \
--url=https://clients2.google.com/cr/staging_report \
--annotation=prod=crashpad \
--annotation=ver=0.7.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1001993002
origin.
This adds AuditPIDFromMachMessageTrailer() to get the process ID of a
Mach message’s sender. Exception messages are considered suspicious when
not sent by the kernel or the exception process.
TEST=crashpad_util_test
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1001943002
The client ID is added to a new field, MinidumpCrashpadInfo::client_id,
in each minidump file that is written. The ProcessSnapshot::ClientID()
gives access to value at the snapshot level. In the upload thread,
client IDs are retrieved from minidump files and used to populate the
“guid” HTTP form parameter.
The Breakpad client supplies these values at upload without hyphens and
with all capital letters. Currently, the Crashpad client uses hyphens
and lowercase letters when communicating with a Breakpad server.
TEST=crashpad_minidump_test MinidumpCrashpadInfoWriter.*,
crashpad_snapshot_test ProcessSnapshotMinidump.*,
run_with_crashpad --handler crashpad_handler \
-a --database=/tmp/crashpad_db \
-a --url=https://clients2.google.com/cr/staging_report \
-a --annotation=prod=crashpad \
-a --annotation=ver=0.7.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/998033002
disabled.
ClientInfo::set_system_crash_reporter_forwarding() can be used to
disable forwarding. The first module that is found with a non-default
value in this field will dictate whether forwarding is enabled or
disabled. It is possible to enable or disable reporting with this call,
as well as reset it to default, which will allow later modules a chance
to influence the behavior.
ClientInfo::set_crashpad_handler_behavior() is also provided, which can
be used to disable Crashpad’s handling of the exception. Most users
should not call this, but should use Settings::SetUploadsEnabled()
instead.
TEST=crashpad_snapshot_test \
CrashpadInfoClientOptions.*:MachOImageReader.Self_DyldImages; \
run_with_crashpad --handler crashpad_handler \
-a --database=/tmp/crashpad_db \
-a --url=https://clients2.google.com/cr/staging_report \
-a --annotation=prod=crashpad \
-a --annotation=ver=0.7.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/997713002
upload attempts to no more than 1 per hour.
The rate limiting is simplistic but duplicates the existing Breakpad
client’s behavior, and is suitable for the time being.
TEST=run_with_crashpad --handler crashpad_handler \
-a --database=/tmp/crashpad_db \
-a --url=https://clients2.google.com/cr/staging_report \
-a --annotation=prod=crashpad \
-a --annotation=ver=0.7.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/992303002
A couple of the problems related to not having a C++11 library:
- You can’t put const elements into a std::vector<>, so
CrashReportDatabase::GetPendingReports() and
CrashReportDatabase::GetCompletedReports() need to change. There was
no data-safety benefit to const elements.
- std::string::pop_back() does not exist, another mechanism must be
used to trim strings in BreakpadHTTPFormParametersFromMinidump().
One relates to a feature that does not exist in 10.6:
- The O_CLOEXEC flag to open() was introduced in 10.7. Although it
would be possible to use fcntl(..., F_SETFD, FD_CLOEXEC) on 10.6, the
O_CLOEXEC behavior is just removed from
CrashReportDatabaseMac::ObtainReportLock(), in line with other open()
calls in Crashpad.
And one was a real bug:
- #define __STDC_FORMAT_MACROS before #including <inttypes.h> to get
format macros like SCNx32, used in UUID::InitializeFromString().
TEST=* (gyp_crashpad.py -Dmac_sdk=10.6 -Dmac_deployment_target=10.6)
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/987693004
The handler is now capable of uploading crash reports from the database.
At present, only one upload attempt is made, and the report will be
moved to “completed” in the database after the attempt, regardless of
whether it succeeded or failed.
The handler also has support to push annotations from its command line
into the process annotations map of each crash report it writes. This is
intended to set basic information about each crash report, such as the
product ID and version. Each potentially crashy process can’t be relied
on to maintain this information on their own.
With this change, Crashpad is now 100% capable of running a handler that
maintains a database and uploads crash reports to a Breakpad-type server
such that Breakpad properly interprets the reports. This is all possible
from the command line.
TEST=run_with_crashpad --handler crashpad_handler \
-a --database=/tmp/crashpad_db \
-a --url=https://clients2.google.com/cr/staging_report \
-a --annotation=prod=crashpad \
-a --annotation=ver=0.6.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/982613002
Upload isn’t actually hooked up yet, but this establishes the upload
thread and provides all of the plumbing to process pending reports. For
the time being, SkipReportUpload() is called for all pending reports.
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/918743002
crashpad_handler is Crashpad’s exception handler server.
Currently, it runs a loop to receive exceptions, and exits when it no
longer has any clients. In the future, this will be extended to write
and potentially upload dumps.
The handler is expected to be started by its initial client via the
CrashpadClient interface.
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/789693005