Previously, only the top-level exception code was reported via the
Crashpad.ExceptionCode.Mac histogram. Making this histogram work
(https://crbug.com/678720) has revealed that Chrome is triggering
EXC_RESOURCE exceptions at a rate in excess of 4x that of ordinary
crashes. These exceptions were not previously visible because they are
not uploaded unless the system treats them as fatal, which it does not
normally do absent an explicit request.
In order to learn more about the problem, this change augments the data
reported via the Crashpad.ExceptionCode.Mac histogram to report (at
least) second-level exception data. This means that we will no longer
see just EXC_RESOURCE, but potentially more useful information such as
EXC_RESOURCE / RESOURCE_TYPE_IO / FLAVOR_IO_PHYSICAL_WRITES. This also
applies to other exception types, so that the majority of crashes
currently falling into the EXC_CRASH bucket will now have additional
information decoded and will be reported as, for example, EXC_BAD_ACCESS
/ KERN_INVALID_ADDRESS, EXC_BAD_INSTRUCTION / EXC_I386_INVOP, and
EXC_CRASH / SIGABRT.
Because the old mechanism was only live (in an “it works” sense) for
several days, and the new mechanism does not overlap with histogram
values used by the old one, there’s no need to invent a new histogram
name.
BUG=chromium:684051
Change-Id: Ia0a372b4127f7b3b2e7dbbaac9304cce3b5aadfe
Reviewed-on: https://chromium-review.googlesource.com/430933
Reviewed-by: Scott Graham <scottmg@chromium.org>
Includes mini_chromium DEPS roll for:
88e0a3e Add stub of sparse_histogram.h
R=mark@chromium.org
BUG=crashpad:100
Change-Id: I4c541a33be0f7f47e972af638d4765bd06682acf
Reviewed-on: https://chromium-review.googlesource.com/386385
Reviewed-by: Mark Mentovai <mark@chromium.org>
In https://codereview.chromium.org/1411523006, the Mach port scopers are
becoming better ScopedGenerics and are losing the type conversion
operators in the process. This is needed to adapt to that change. get()
is ugly, but being explicit about conversion isn’t a bad thing, and
these scopers will gain functionality such as Pass() as part of the
switch.
As a bonus, some would-be uses of get() to check for valid port rights
are becoming a more descriptive is_valid().
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1405273002 .
launchd actually does set the EXC_RESOURCE and EXC_GUARD handlers
exactly the same way that it sets the EXC_CRASH handler. See 10.9.5
launchd-842.92.1/src/core.c job_setup_exception_port().
Cases where an EXC_CRASH handler is set but EXC_RESOURCE and EXC_GUARD
handlers are not set occur when the exception ports are set by
/usr/bin/login instead of launchd. login looks up the
exception-reporting service by name and sets the exception port without
including EXC_MASK_RESOURCE or EXC_MASK_GUARD in the mask. See 10.10.5
system_cmds-643.30.1/login.tproj/login.c main().
login is a setuid executable, so it does not inherit its parent process’
exception handlers. See 10.10.5 xnu-2782.40.9/osfmk/kern/ipc_tt.c
ipc_task_reset().
Terminal.app executes login when establishing its command-line
environment, so the exception handlers set for Terminal.app itself
(including EXC_MASK_CRASH, EXC_MASK_RESOURCE, and EXC_MASK_GUARD) are
discarded, and then login sets an exception handler only for
EXC_MASK_CRASH. The same thing occurs for any other process descended
from login, including SSH sessions, because sshd executes login.
This is a bug in login filed as Apple radar 22978644. This bug led to a
misunderstanding about the use of EXC_RESOURCE and EXC_GUARD. Comments
that discuss this behavior are now reworded to be accurate, and
non-fatal EXC_RESOURCE exceptions are made eligible for forwarding to
the user ReportCrash (because it would normally handle them in the
absence of Crashpad) while Crashpad itself will still skip processing
them.
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1391453002 .
Chrome’s relauncher process needs a way to sever ties with the
crashpad_handler instance running from the disk image in order to cause
that instance to exit so that the disk image may be unmounted. This new
function is otherwise not thought to be interesting, and its use is not
recommended.
This comes with a small refactoring to create a
SystemCrashReporterHandler() function, and a fix for a minor port leak
in CrashReportExceptionHandler::CatchMachException().
BUG=chromium:538373
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1375573005 .
CrashReportExceptionHandler::CatchMachException() must always set a
valid new_state. Failing to do so appears to trigger corpse generation
on OS X 10.11. This is addressed by calling ExcServerCopyState().
Previously, this was not done for exceptions forwarded to the user
ReportCrash, under the apparent mistaken assumption that ReportCrash
would do it. However, ReportCrash is given copies of out-parameters like
new_state to explicitly prevent it from influencing Crashpad’s returned
state.
ExcServerSuccessfulReturnValue() must not return MACH_RCV_PORT_DIED for
an EXC_CRASH handler on OS X 10.11. This appears to trigger corpse
generation. This is addressed by always returning KERN_SUCCESS from
EXC_CRASH handlers on OS X 10.11.
This also adds generic EXC_CORPSE_NOTIFY support throughout Crashpad.
The crashpad_handler does not listen for this exception type, but it is
now possible to work with this exception type using tools like
exception_port_tool and catch_exception_tool.
BUG=crashpad:48
TEST=Crashes handled by crashpad_handler do not result in the generation
of reports in the root /Library/Logs/DiagnosticReports.
R=kerrnel@chromium.org, rsesek@chromium.org
Review URL: https://codereview.chromium.org/1305893010 .
The main goal was to get the beginnings of module iteration and retrieval
of CrashpadInfo in snapshot. The main change for that is to move
crashpad_info_client_options[_test] down out of mac/.
This also requires adding some of the supporting code of snapshot in
ProcessReaderWin, ProcessSnapshotWin, and ModuleSnapshotWin. These are
partially copied from Mac or stubbed out with lots of TODO annotations.
This is a bit unfortunate, but seemed like the most productive way to
make progress incrementally. That is, it's mostly placeholder at the
moment, but hopefully has the right shape for things to come.
R=mark@chromium.org
BUG=crashpad:1
Review URL: https://codereview.chromium.org/1052813002
This adds IsExceptionNonfatalResource() and its test, and uses it in
crashpad_handler. When non-fatal resource exceptions are encountered, no
crash report is generated. crashpad_handler swallows these exceptions.
Alternatively, it could allow them to be sent to the system’s host-level
resource exception handler, normally com.apple.ReportCrash.root, which
would allow them to be processed in the same way as when Crashpad is not
in use. I’m not sure which option is better. I chose to swallow them
because there doesn’t appear to be much value in letting
com.apple.ReportCrash.root and spindump look at them.
This also moves ExcCrashRecoverOriginalException() to the new file as a
sibling of IsExceptionNonfatalResource(). This provides better
organization.
BUG=crashpad:35, chromium:474163, chromium:474326
TEST=crashpad_util_test ExceptionTypes.IsExceptionNonfatalResource
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1066243002
These two exception types use all 64 bits of the code[0] field. The
ExceptionSnapshot was unprepared to stuff this into a 32-bit field. To
resolve the discrepancy, the more-significant data is taken from the
high 32 bits of code[0]. No information is lost because the full code[0]
is made available as part of the Codes() vector.
BUG=crashpad:34
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1050313003
ExcServerCopyState() properly sets the new_state and new_state_count
out-parameters for exception handler routines that may deal with
state-carrying exceptions.
This used to exist inline in catch_exception_tool, but that
implementation had a bug caught by the new test.
TEST=crashpad_util_test ExcServerVariants.ExcServerCopyState and others
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1049023003
Now that Chrome’s about:crashes displays the crash report UUID, I wanted
to add it to the minidump. In the future, we may be able to index these
on the server. This will also help identify dumps that correspond to the
same event once we’re equipped to convert between different formats.
Ideally, this new field is populated with the same UUID used locally in
the crash report database. To make this work,
CrashReportDatabase::NewReport must carry the UUID. This was actually
part of CrashReportDatabaseWin’s private extension to NewReport, so that
extension subclass can now be cleaned up.
TEST=crashpad_minidump_test MinidumpCrashpadInfoWriter.*,
crashpad_client_test CrashReportDatabaseTest.NewCrashReport
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1000263003
origin.
This adds AuditPIDFromMachMessageTrailer() to get the process ID of a
Mach message’s sender. Exception messages are considered suspicious when
not sent by the kernel or the exception process.
TEST=crashpad_util_test
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/1001943002
The client ID is added to a new field, MinidumpCrashpadInfo::client_id,
in each minidump file that is written. The ProcessSnapshot::ClientID()
gives access to value at the snapshot level. In the upload thread,
client IDs are retrieved from minidump files and used to populate the
“guid” HTTP form parameter.
The Breakpad client supplies these values at upload without hyphens and
with all capital letters. Currently, the Crashpad client uses hyphens
and lowercase letters when communicating with a Breakpad server.
TEST=crashpad_minidump_test MinidumpCrashpadInfoWriter.*,
crashpad_snapshot_test ProcessSnapshotMinidump.*,
run_with_crashpad --handler crashpad_handler \
-a --database=/tmp/crashpad_db \
-a --url=https://clients2.google.com/cr/staging_report \
-a --annotation=prod=crashpad \
-a --annotation=ver=0.7.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/998033002
disabled.
ClientInfo::set_system_crash_reporter_forwarding() can be used to
disable forwarding. The first module that is found with a non-default
value in this field will dictate whether forwarding is enabled or
disabled. It is possible to enable or disable reporting with this call,
as well as reset it to default, which will allow later modules a chance
to influence the behavior.
ClientInfo::set_crashpad_handler_behavior() is also provided, which can
be used to disable Crashpad’s handling of the exception. Most users
should not call this, but should use Settings::SetUploadsEnabled()
instead.
TEST=crashpad_snapshot_test \
CrashpadInfoClientOptions.*:MachOImageReader.Self_DyldImages; \
run_with_crashpad --handler crashpad_handler \
-a --database=/tmp/crashpad_db \
-a --url=https://clients2.google.com/cr/staging_report \
-a --annotation=prod=crashpad \
-a --annotation=ver=0.7.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/997713002
The handler is now capable of uploading crash reports from the database.
At present, only one upload attempt is made, and the report will be
moved to “completed” in the database after the attempt, regardless of
whether it succeeded or failed.
The handler also has support to push annotations from its command line
into the process annotations map of each crash report it writes. This is
intended to set basic information about each crash report, such as the
product ID and version. Each potentially crashy process can’t be relied
on to maintain this information on their own.
With this change, Crashpad is now 100% capable of running a handler that
maintains a database and uploads crash reports to a Breakpad-type server
such that Breakpad properly interprets the reports. This is all possible
from the command line.
TEST=run_with_crashpad --handler crashpad_handler \
-a --database=/tmp/crashpad_db \
-a --url=https://clients2.google.com/cr/staging_report \
-a --annotation=prod=crashpad \
-a --annotation=ver=0.6.0 \
crashy_program
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/982613002
Upload isn’t actually hooked up yet, but this establishes the upload
thread and provides all of the plumbing to process pending reports. For
the time being, SkipReportUpload() is called for all pending reports.
R=rsesek@chromium.org
Review URL: https://codereview.chromium.org/918743002