Async stack traces in folly: Synchronous and asynchronous stack traces

This article was written by Lee Howes and Lewis Baker from Facebook.

This is the second in a series of posts covering how we have used C++ coroutines at Facebook to regain stack traces for dependent chains of asynchronous waiting tasks. In the previous blog post we talked about the work we have done to implement stack traces for asynchronous coroutine code. Here we’ll go into more detail on the technical differences and challenges involved in implementing async stack traces on top of C++ coroutines, compared with traditional stack traces.

Normal synchronous stack traces

With normal stacks, when you call a function, the compiler generates code that automatically maintains a linked list of stack frames; this list represents the call stack. At the start of each frame is a structure that (at least on Intel architectures) looks like this:

struct stack_frame {
  stack_frame* nextFrame;
  void* returnAddress;
};

This structure is usually filled out by specialised assembly instructions. For example, on x86_64 a caller executes a call instruction, which pushes the return address onto the stack and jumps to the function entry point. The first instructions of the callee then push the rbp register (which usually holds the pointer to the current stack_frame structure) onto the stack and copy the rsp register (which now contains the pointer to the stack_frame structure we just populated) into the rbp register.

For example:

caller:
    ...
    call callee          # Pushes address of next instruction onto stack,
                         # populating 'returnAddress' member of 'stack_frame'.
                         # Then jumps to 'callee' address.
    mov [rsp - 16], rax  # Save the result somewhere
    ...

callee:
    push rbp             # Push rbp (stack_frame ptr) onto stack (populates 'nextFrame' member)
    mov rbp, rsp         # Update rbp to point to new stack_frame
    sub rsp, 16          # Reserve an additional 16 bytes of stack-space
    ...
    mov rax, 42          # Set return-value to 42
    leave                # Copy rbp -> rsp, pop rbp from stack
    ret                  # Pop return address from top of stack and jump to it

When a debugger or profiler captures a stack trace for a given thread, it obtains a pointer to the first stack frame from the thread’s rbp register and then starts walking this linked list until it reaches the stack root, recording the return addresses it sees along the way in a buffer.
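In rough C++, such a walker is only a few lines. The sketch below assumes frame pointers are preserved (e.g. the binary was built with -fno-omit-frame-pointer) and uses the GCC/Clang __builtin_frame_address intrinsic to find the current frame:

#include <cstddef>

// Mirrors the stack_frame layout shown above.
struct stack_frame {
  stack_frame* nextFrame;
  void* returnAddress;
};

// A minimal frame-pointer walker: records return addresses until it reaches
// the stack root (assumed here to be marked by a null nextFrame pointer).
std::size_t captureStackTrace(void** addresses, std::size_t maxFrames) {
  // __builtin_frame_address(0) yields the current frame pointer, which points
  // at this function's stack_frame record.
  auto* frame = static_cast<stack_frame*>(__builtin_frame_address(0));
  std::size_t count = 0;
  while (frame != nullptr && count < maxFrames) {
    addresses[count++] = frame->returnAddress;  // where the caller resumes
    frame = frame->nextFrame;                   // follow the list to the caller's frame
  }
  return count;
}

A production-quality walker also validates that each frame pointer stays within the thread’s stack bounds, but the overall structure is the same.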

Subsequent profiling tools may then translate the addresses to function names and/or file and line numbers using a symbolizer that makes use of the binary’s debug info, and this information can be logged or displayed as appropriate for the tool in question.

Usually these stack frames live in a single contiguous memory region, with each frame’s nextFrame pointer linking it to its caller’s frame.

If we want to walk the async-stack trace instead of the normal stack trace, we still want to start by walking normal stack frames, just as we do for a normal stack trace. A coroutine may call normal functions and we want to include the frames for these normal function calls in the stack trace. However, when we get to the frame corresponding to the top-most coroutine (call it coro_function_1), we do not want to follow its nextFrame link into the coroutine_handle::resume method, as normal stack walking would. Instead we need a link to the waiting coroutine. At this point the normal stack trace and the async-stack trace diverge. Walking an async-stack trace therefore involves answering a few questions:

  • How do we identify which stack frames correspond to async frames?
  • How do we find the address of the next async frame?
  • How and where do we allocate storage for the async frame data?

Before we can implement any of this, we need to understand a bit more about how coroutines are structured.

How do coroutine “stacks” differ from normal stacks?

When you call a coroutine in C++, storage is allocated for a coroutine frame. The allocation is usually obtained from the heap, although the compiler is free to optimise this allocation out in some circumstances, for example by inlining the allocation into the frame of the caller.

The compiler uses the coroutine frame storage to store all of the state that needs to be preserved when the coroutine is suspended so that it is available when the coroutine is later resumed. This usually includes storage for function parameters, local variables, temporaries and any other state the compiler deems necessary, such as at which suspend-point the coroutine is suspended.

The coroutine frame also includes storage of a special object, the coroutine promise, which controls the behaviour of the coroutine. The compiler lowers your coroutine function into a sequence of calls to methods on the coroutine promise object at certain key points within the coroutine body, in addition to the user-written code of the coroutine. The coroutine promise controls the behaviour of the coroutine by implementing the desired behaviour in these methods. For more details about the promise type see the blog-post Understanding the promise type.

The promise type is determined from the signature of the coroutine function, and for most coroutine types it is based solely on the return type of the coroutine. This allows coroutines that return a given type (e.g. folly::coro::Task<T>) to store additional per-coroutine-frame data within the coroutine frame, simply by adding data members to the promise type.
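As a purely illustrative example, the MiniTask type below is a stand-in for a coroutine return type like folly::coro::Task<T>: the compiler finds its nested promise_type via std::coroutine_traits, and any data member added to that promise (here perFrameCounter) gets its storage inside every coroutine frame for coroutines returning MiniTask:

#include <coroutine>

// Illustrative stand-in for a coroutine return type (not folly's real Task).
struct MiniTask {
  struct promise_type {
    int perFrameCounter = 0;  // extra per-coroutine-frame data lives in the frame

    MiniTask get_return_object() { return {}; }
    std::suspend_never initial_suspend() noexcept { return {}; }
    std::suspend_never final_suspend() noexcept { return {}; }
    void return_void() {}
    void unhandled_exception() {}
  };
};

// Because this function returns MiniTask, the compiler selects
// std::coroutine_traits<MiniTask>::promise_type (the nested type above) and
// embeds an instance of it, including perFrameCounter, in the coroutine frame.
MiniTask example() {
  co_return;
}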

For the clang implementation, the layout of a coroutine frame for a given coroutine function looks a bit like this:

struct __foo_frame {
  using promise_type = typename std::coroutine_traits<Ret, Arg1, Arg2>::promise_type;

  void (*resumeFn)(void*);   // coroutine_handle::resume() function-pointer
  void (*destroyFn)(void*);  // coroutine_handle::destroy() function-pointer
  promise_type promise;      // coroutine promise object
  int suspendPoint;          // keeps track of which suspend-point the coroutine is suspended at
  char extra[458];           // extra storage space for local variables, parameters,
                             // temporaries, spilled registers, etc.
};

When a coroutine is suspended, all of the state for that coroutine invocation is stored in the coroutine frame and there is no corresponding stack frame. However, when a coroutine is resumed on a given thread, a stack frame for that coroutine becomes active on that thread’s stack (just like a normal function call), and this stack frame is used for any temporary storage whose lifetime does not span a suspend point.

When a coroutine body is currently executing, the pointer to the current coroutine frame is usually held in a register, which allows it to quickly reference state within the coroutine frame. However, this address may also be spilled into the stack frame in some cases, in particular when the coroutine is calling another function.

Thus, when a coroutine is active and has called another function, its coroutine frame lives on the heap while its current stack frame (and those of any functions it has called) live on the thread’s stack.

Note in particular that walking the asynchronous version of the stack means diverging into heap memory by following the pointer to the coroutine frame of the top-most coroutine (coro_function_1 in the example above) rather than its nextFrame pointer. Unlike the pointer to the current stack frame, which can generally be assumed to be stored in the rbp register, there is no standard location for the pointer to the coroutine frame. This has implications for how we are able to navigate from a stack-frame to its corresponding async-frame data.

Chaining asynchronous stack frames

To produce a stack trace that represents the async-call chain instead of the normal synchronous call chain, we need to be able to walk the chain of coroutine frames, recording the return/continuation address of each coroutine along the way.

The first piece of the puzzle is storing the state needed to walk the async stack-frames during an async stack trace. For each async-stack frame we need to be able to determine both the address of the next async-stack frame and the return address of the current frame.

One of the constraints to be aware of here is that the code that is going to be walking the stack trace is not necessarily going to have access to debug information for the program. Profiling tooling, for example, may want to sample function offsets only and symbolize later, as it does for synchronous stack traces. We must be able to walk the async stack without needing additional complex data structures which could make the stack walk overly expensive. For example, profiling tools built on the Linux eBPF facility must be able to execute in a deterministic, finite amount of time.

There is technically enough information in the folly::coro::Task‘s promise type, folly::coro::TaskPromise, to walk to the next frame: it already stores the coroutine_handle for the continuation, and the coroutine frame of that continuation already encodes which suspend-point of which coroutine function is awaiting it in its resumeFn and suspendPoint members. However, there are some challenges with trying to use this information directly when walking an async-stack trace.

Representing a chain of async stack-frames

If we have a pointer to a coroutine frame stored in a coroutine_handle then in theory, provided we know the layout of the promise object, we can calculate the address of the continuation member of the promise (which contains the address of the next coroutine frame) simply by adding a constant offset to the coroutine-frame pointer.

One approach would be to require that all promise types store a coroutine_handle for the continuation as their first data member:

template <typename T>
struct TaskPromise {
  std::coroutine_handle<void> continuation;
  Try<T> result;
  ...
};

struct __some_coroutine_frame {
  void (*resumeFn)(void*);
  void (*destroyFn)(void*);
  TaskPromise<int> promise;
  int suspendPoint;
};

Then, even if we do not know the concrete promise type, we know that its first member is a coroutine_handle and that the promise is placed immediately after the two function pointers. A debugger walking the async-stack trace could then assume that coroutine frames look like this:

struct coroutine_frame {
  void (*resumeFn)(void*);
  void (*destroyFn)(void*);
  coroutine_frame* nextFrame;
};

Unfortunately, this approach breaks down when the promise-type is overaligned (that is, it has an alignment larger than 2 pointers: 32 bytes or larger on 64-bit platforms). This can happen if the folly::coro::Task<T> type is instantiated for an overaligned type, T, for example, a matrix type optimised for use with SIMD instructions.

In such cases the compiler inserts padding between the function pointers and the promise object in the structure to ensure that the promise is correctly aligned. This variation in layout makes it much more difficult to determine what offset to look at for the next coroutine-frame address because it is type dependent; the debugger needs to know something about the layout of the promise type to be able to calculate the offset.
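A small, self-contained sketch shows the effect; the NormalPromise and OveralignedPromise types below are hypothetical stand-ins, and FakeFrame only approximates the compiler-generated frame layout:

#include <cstddef>
#include <cstdio>

// Hypothetical promise types used only to illustrate the padding problem.
struct NormalPromise {
  void* continuation;  // stands in for std::coroutine_handle<>
};

struct alignas(64) OveralignedPromise {  // e.g. a promise holding a SIMD-friendly result
  void* continuation;
};

// Simplified stand-in for the compiler-generated frame layout.
template <typename Promise>
struct FakeFrame {
  void (*resumeFn)(void*);
  void (*destroyFn)(void*);
  Promise promise;
};

int main() {
  // With a normally-aligned promise the continuation sits right after the two
  // function pointers (offset 16 on a typical 64-bit platform)...
  std::printf("normal:      %zu\n", offsetof(FakeFrame<NormalPromise>, promise));
  // ...but with an overaligned promise the compiler inserts padding, so the
  // offset jumps to the promise's alignment (64 in this sketch) and is
  // therefore type dependent rather than a constant a debugger could hard-code.
  std::printf("overaligned: %zu\n", offsetof(FakeFrame<OveralignedPromise>, promise));
}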

In theory we could look at the values of the resumeFn/destroyFn pointers to look up, in a translation table, the promise type that corresponds to the coroutine body, but this would require either debug information or modifying the compiler to encode this information in the binary. We cannot assume the availability of debug info, and modifying the compiler is a much larger project. Other approaches are possible, such as changing the ABI of coroutine frames to eliminate the padding, but these also require compiler changes and would make the implementation compiler dependent.

The approach we took instead is to insert a new folly::AsyncStackFrame data structure as a member of the coroutine promise and use these members to form an intrusive linked list of async frames. That is, a structure that looks something like this:

namespace folly {
  struct AsyncStackFrame {
    AsyncStackFrame* parentFrame;
    // other members...
  };
}

that can then be added as a member to the coroutine promise objects:

namespace folly::coro {
  class TaskPromiseBase {
    ...
   private:
    std::coroutine_handle<> continuation_;
    AsyncStackFrame asyncFrame_;
    ...
  };
}

Whenever we launch a child coroutine by co_awaiting it we can hook up that child coroutine’s AsyncStackFrame so that its parentFrame member points to the parent coroutine’s AsyncStackFrame object.
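As a rough sketch of where that hook-up can happen (using illustrative MiniPromise and MiniAwaiter stand-ins rather than folly's actual Task implementation), the link can be established in the awaiter's await_suspend, at the same point where the continuation handle is recorded:

#include <coroutine>

// Illustrative stand-ins for folly::AsyncStackFrame and Task's promise/awaiter.
struct AsyncFrame {
  AsyncFrame* parentFrame = nullptr;
};

struct MiniPromise {
  std::coroutine_handle<> continuation;  // who to resume when we finish
  AsyncFrame asyncFrame;                 // this coroutine's async stack frame
  // ... usual promise members elided ...
};

struct MiniAwaiter {
  std::coroutine_handle<MiniPromise> childCoro;

  bool await_ready() const noexcept { return false; }

  // Called with a handle to the *parent* coroutine when it co_awaits the child.
  std::coroutine_handle<> await_suspend(
      std::coroutine_handle<MiniPromise> parent) noexcept {
    auto& childPromise = childCoro.promise();
    // Remember how to resume the parent once the child completes...
    childPromise.continuation = parent;
    // ...and chain the async frames so a stack walker can follow parentFrame
    // links instead of the normal stack.
    childPromise.asyncFrame.parentFrame = &parent.promise().asyncFrame;
    return childCoro;  // symmetric transfer into the child coroutine
  }

  void await_resume() noexcept {}
};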

Using a separate data structure gives us a lot of flexibility in how we represent async-stack traces. It insulates the data structures from any dependence on compiler internals, and will allow us to reuse AsyncStackFrame objects for non-coroutine async operations in future. It comes at a small memory and run time cost as we now effectively have two pointers to the parent coroutine to store and maintain. This decision can be revisited in future if we later want to squeeze some more performance by making some of the previously mentioned compiler changes.
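With the intrusive parentFrame links in place, walking the async chain is just pointer chasing. The sketch below assumes the walker has already been handed the top-most AsyncStackFrame; how that frame is found, and how each frame's return address is recorded, is covered later in the series:

#include <cstddef>

// Mirrors the folly::AsyncStackFrame shape shown above (other members elided).
struct AsyncStackFrame {
  AsyncStackFrame* parentFrame;
  // other members...
};

// A sketch of walking the async-frame chain by following parentFrame links.
std::size_t walkAsyncStack(
    const AsyncStackFrame* topFrame,
    const AsyncStackFrame** frames,
    std::size_t maxFrames) {
  std::size_t count = 0;
  for (const AsyncStackFrame* frame = topFrame;
       frame != nullptr && count < maxFrames;
       frame = frame->parentFrame) {
    frames[count++] = frame;  // record each async frame in the chain
  }
  return count;
}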

Now we have a way to represent a chain of async-frames that can be walked by a debugger without needing to know anything about the concrete promise type. In the next post in the series we will look at how to determine the return-address of a coroutine and how to use these data structures to hook coroutine frames into a chain at runtime.

To learn more about Facebook Open Source, visit our open source site, subscribe to our YouTube channel, or follow us on Twitter and Facebook.

Interested in working with open source technologies at Facebook? Check out our open source-related job postings on our career page.

Updating Special Ad Audiences for housing, employment, and credit advertisers

On June 21, 2022 we announced an important settlement with the US Department of Housing and Urban Development (HUD) that will change the way we deliver housing ads to people residing in the US. Specifically, we are building into our ads system a method designed to make sure the audience that ends up seeing a housing ad more closely reflects the eligible targeted audience for that ad.

As part of this agreement, we will also be sunsetting Special Ad Audiences, a tool that lets advertisers expand their audiences for ad sets related to housing. We are choosing to sunset this for employment and credit ads as well. In 2019, in addition to eliminating certain targeting options for housing, employment and credit ads, we introduced Special Ad Audiences as an alternative to Lookalike Audiences. But the field of fairness in machine learning is a dynamic and evolving one, and Special Ad Audiences was an early way to address concerns. Now, our focus will move to new approaches to improve fairness, including the method previously announced.

What’s happening: We’re removing the ability to create Special Ad Audiences via Ads Manager beginning on August 25, 2022.

Beginning October 12th, 2022, we will pause any remaining ad sets that contain Special Ad Audiences. These ad sets may be restarted once advertisers have removed any and all Special Ad Audiences from those ad sets. We are providing a two-month window between preventing new Special Ad Audiences and pausing existing Special Ad Audiences to give advertisers time to adjust budgets and strategies as needed.

For more details, please visit our Newsroom post.

Impact to Advertisers using Marketing API on September 13, 2022

For advertisers and partners using the API listed below, the blocking of new Special Ad Audience creation will present a breaking change on all versions. Beginning August 15, 2022, developers can start to implement the code changes, and will have until September 13, 2022, when the non-versioning change occurs and prior values are deprecated. Refer below to the list of impacted endpoints related to this deprecation:

For reading audience:

  • endpoint gr:get:AdAccount/customaudiences
  • field operation_status

For adset creation:

  • endpoint gr:post:AdAccount/adsets
  • field subtype

For adset editing:

  • endpoint gr:post:AdCampaign
  • field subtype

For custom audience creation:

  • endpoint gr:post:AdAccount/customaudiences
  • field subtype

For custom audience editing:

  • endpoint gr:post:CustomAudience

Please refer to the developer documentation for further details to support code implementation.

First seen at developers.facebook.com

Introducing an Update to the Data Protection Assessment

Over the coming year, some apps with access to certain types of user data on our platforms will be required to complete the annual Data Protection Assessment. We have made a number of improvements to this process since our launch last year, when we introduced our first iteration of the assessment.

The updated Data Protection Assessment will include a new developer experience that is enhanced through streamlined communications, direct support, and clear status updates. Today, we’re sharing what you can expect from these new updates and how you can best prepare for completing this important privacy requirement if your app is within scope.

If your app is in scope for the Data Protection Assessment, and you’re an app admin, you’ll receive an email and a message in your app’s Alert Inbox when it’s time to complete the annual assessment. You and your team of experts will then have 60 calendar days to complete the assessment. We’ve built a new platform that enhances the user experience of completing the Data Protection Assessment. These updates to the platform are based on learnings over the past year from our partnership with the developer community. When completing the assessment, you can expect:

  • Streamlined communication: All communications and required actions will be through the My Apps page. You’ll be notified of pending communications requiring your response via your Alerts Inbox, email, and notifications in the My Apps page.

    Note: Other programs may still communicate with you through the App Contact Email.

  • Available support: Ability to engage with Meta teams via the Support tool to seek clarification on the questions within the Data Protection Assessment prior to submission, to get help with any requests for more information, or to resolve violations.

    Note: To access this feature, you will need to add the app and app admins to your Business Manager. Please refer to those links for step-by-step guides.

  • Clear status updates: Easy to understand status and timeline indicators throughout the process in the App Dashboard, App Settings, and My Apps page.
  • Straightforward reviewer follow-ups: Streamlined experience for any follow-ups from our reviewers, all via developers.facebook.com.

We’ve included a brief video that provides a walkthrough of the experience you’ll have with the Data Protection Assessment.

The Data Protection Assessment elevates the importance of data security and helps gain the trust of the billions of people who use our products and services around the world. That’s why we are committed to providing a seamless experience for our partners as you complete this important privacy requirement.

Here is what you can do now to prepare for the assessment:

  1. Make sure you are reachable: Update your developer or business account contact email and notification settings.
  2. Review the questions in the Data Protection Assessment and engage with your teams on how best to answer these questions. You may have to enlist the help of your legal and information security points of contact to answer some parts of the assessment.
  3. Review Meta Platform Terms and our Developer Policies.

We know that when people choose to share their data, we’re able to work with the developer community to safely deliver rich and relevant experiences that create value for people and businesses. It’s a privilege we share when people grant us access to their data, and it’s imperative that we protect that data in order to maintain and build upon their trust. This is why the Data Protection Assessment focuses on data use, data sharing and data security.

Data privacy is challenging and complex, and we’re dedicated to continuously improving the processes to safeguard user privacy on our platform. Thank you for partnering with us as we continue to build a safer, more sustainable platform.

First seen at developers.facebook.com

Resources for Completing App Store Data Practice Questionnaires for Apps That Include the Facebook or Audience Network SDK

First seen at developers.facebook.com
