Signals in prod: dangers and pitfalls


In this blog post, Chris Down, a Kernel Engineer at Meta, discusses the pitfalls of using Linux signals in production environments and why developers should avoid using them whenever possible.

What are Linux Signals?

A signal is an event that Linux systems generate in response to some condition. Signals can be sent by the kernel to a process, by a process to another process, or a process to itself. Upon receipt of a signal, a process may take action.
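As a minimal illustration of the mechanics, the sketch below (an illustrative example; the handler and variable names are arbitrary) installs a handler for SIGUSR1 and then sends that signal to itself with raise:

#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t got_usr1;

static void on_usr1(int sig __attribute__((unused))) { got_usr1 = 1; }

int main(void) {
  /* Install a handler so SIGUSR1 no longer takes its default action. */
  struct sigaction sa = {.sa_handler = on_usr1};
  sigaction(SIGUSR1, &sa, NULL);

  /* A process can signal itself; kill(pid, SIGUSR1) from another
   * process would arrive the same way. */
  raise(SIGUSR1);

  if (got_usr1)
    puts("took action on SIGUSR1");
  return 0;
}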

Signals are a core part of Unix-like operating environments and have existed since more or less the dawn of time. They are the plumbing for many of the core components of the operating system (core dumping, process life cycle management, etc.), and in general, they have held up pretty well in the fifty or so years that we have been using them. As such, when somebody suggests that using them for interprocess communication (IPC) is potentially dangerous, one might think these are the ramblings of someone desperate to reinvent the wheel. However, this article is intended to demonstrate cases where signals have been the cause of production issues and to offer some potential mitigations and alternatives.

Signals may appear attractive due to their standardization, wide availability and the fact that they don't require any additional dependencies outside of what the operating system provides. However, they can be difficult to use safely. Signals make a vast number of assumptions that must be validated against your requirements and carefully configured when they don't hold. In reality, many applications, even widely known ones, fail to do so and may suffer hard-to-debug incidents as a result.

Let us look into a recent incident that occurred in the Meta production environment, reinforcing the pitfalls of using signals. We’ll go briefly over the history of some signals and how they led us to where we are today, and then we’ll contrast that with our current needs and issues that we’re seeing in production.

The Incident

First, let’s rewind a bit. The LogDevice team cleaned up their codebase, removing unused code and features. One of the features that was deprecated was a type of log that documents certain operations performed by the service. This feature eventually became redundant, had no consumers and as such was removed. You can see the change here on GitHub. So far, so good.

The first little while after the change passed without much to speak of: production continued ticking along steadily, serving traffic as usual. A few weeks later, though, a report came in that service nodes were being lost at a staggering rate. It had something to do with the rollout of the new release, but what exactly was wrong was unclear. What was different now that had caused things to fall over?

The team in question narrowed the problem to the code change we mentioned earlier, deprecating these logs. But why? What’s wrong with that code? If you don’t already know the answer, we invite you to look at that diff and try to work out what’s wrong because it’s not immediately obvious, and it’s a mistake anyone could make.

logrotate, Enter the Ring

logrotate is more or less the standard tool for log rotation when using Linux. It’s been around for almost thirty years now, and the concept is simple: manage the life cycle of logs by rotating and vacuuming them.

logrotate doesn't send any signals by itself, so you won't find much, if anything, about them in the logrotate man page or its documentation. However, logrotate can take arbitrary commands to execute before or after its rotations. As a basic example, here is part of the default logrotate configuration shipped with CentOS:

/var/log/cron
/var/log/maillog
/var/log/messages
/var/log/secure
/var/log/spooler
{
    sharedscripts
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}

A bit brittle, but we’ll forgive that and assume that this works as intended. This configuration says that after logrotate rotates any of the files listed, it should send SIGHUP to the pid contained in /var/run/syslogd.pid, which should be that of the running syslogd instance.

This is all well and good for something with a stable public API like syslog, but what about something internal where the implementation of SIGHUP is an internal implementation detail that could change at any time?
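For reference, the reload convention that configurations like the one above rely on usually looks something like the following on the daemon side: a handler records that SIGHUP arrived, and the main loop reopens the log file. This is a rough sketch of the general pattern with a hypothetical log path, not syslogd's actual implementation:

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t reopen_logs;

static void on_hup(int sig __attribute__((unused))) { reopen_logs = 1; }

int main(void) {
  struct sigaction sa = {.sa_handler = on_hup};
  sigaction(SIGHUP, &sa, NULL);

  FILE *log = fopen("/var/log/mydaemon.log", "a"); /* hypothetical path */

  for (;;) {
    pause(); /* sleep until a signal arrives */
    if (reopen_logs) {
      reopen_logs = 0;
      /* Reopen so writes go to the freshly created file, not the
       * rotated one. */
      if (log)
        fclose(log);
      log = fopen("/var/log/mydaemon.log", "a");
    }
  }
}

The contract only holds while the daemon keeps installing that handler; as we are about to see, nothing enforces that the sender and receiver stay in sync.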

A History of Hangups

One of the problems here is that, except for signals which cannot be caught in user space and thus have only one meaning, like SIGKILL and SIGSTOP, the semantic meaning of signals is up to application developers and users to interpret and program. In some cases, the distinction is largely academic, like SIGTERM, which is pretty much universally understood to mean “terminate gracefully as soon as possible.” However, in the case of SIGHUP, the meaning is significantly less clear.

SIGHUP was invented for serial lines and was originally used to indicate that the other end of the connection had dropped the line. We still carry that lineage with us, of course, so SIGHUP is still sent for its modern equivalent: when a pseudo or virtual terminal is closed (hence tools like nohup, which mask it).

In the early days of Unix, there was a need to implement daemon reloading. This usually consists of, at minimum, reopening configuration and log files without restarting, and signals seemed like a dependency-free way to achieve that. Of course, there was no signal for such a thing, but since these daemons have no controlling terminal, there should be no reason for them to receive SIGHUP, so it seemed like a convenient signal to piggyback on without any obvious side effects.


There is a small hitch with this plan though. The default state for signals is not “ignored,” but signal-specific. So, for example, programs don’t have to configure SIGTERM manually to terminate their application. As long as they don’t set any other signal handler, the kernel just terminates their program for free, without any code needed in user space. Convenient!

What’s not so convenient though, is that SIGHUP also has the default behavior of terminating the program immediately. This works great for the original hangup case, where these applications likely aren’t needed anymore, but is not so great for this new meaning.

This would be fine, of course, if we removed all the places that could potentially send SIGHUP to the program. The problem is that in any large, mature codebase, that is difficult. SIGHUP is not like a tightly controlled IPC call that you can easily grep the codebase for. Signals can come from anywhere, at any time, and there are few checks on their operation (other than the most basic "are you this user or do you have CAP_KILL?"). The bottom line is that it's hard to determine where signals could come from, whereas with more explicit IPC, we would know that a given request doesn't mean anything to us and should be ignored.


From Hangup to Hazard

By now, I suppose you may have started to guess what happened. One fateful afternoon, a LogDevice release containing the aforementioned code change began rolling out. At first, nothing went awry, but at midnight the next day, everything mysteriously started falling over. The reason is the following stanza in the machine's logrotate configuration, which sends a now unhandled (and therefore fatal) SIGHUP to the logdevice daemon:

/var/log/logdevice/audit.log {
  daily
  # [...]
  postrotate
    pkill -HUP logdeviced
  endscript
}

Missing just one short stanza of logrotate configuration like this is incredibly easy when removing a large feature. Unfortunately, it's also hard to be certain that every last vestige of a feature's existence was removed at once. Even in cases that are easier to validate than this one, it's common to mistakenly leave remnants behind when doing code cleanup. Usually, though, that's without any destructive consequences: the remaining detritus is just dead or no-op code.


Conceptually, the incident itself and its resolution are simple: don’t send SIGHUP, and spread LogDevice actions out more over time (that is, don’t run this at midnight on the dot). However, it’s not just this one incident’s nuances that we should focus on here. This incident, more than anything, has to serve as a platform to discourage the use of signals in production for anything other than the most basic, essential cases.

The Dangers of Signals

What Signals are Good For

First, using signals as a mechanism to effect changes in the process state of the operating system is well founded. This includes SIGKILL, for which it is impossible to install a signal handler and which does exactly what you would expect, and the kernel-default behavior of SIGABRT, SIGTERM, SIGINT, SIGSEGV, SIGQUIT and the like, which are generally well understood by users and programmers.

What these signals all have in common is that once you’ve received them, they’re all progressing towards a terminal end state within the kernel itself. That is, no more user space instructions will be executed once you get a SIGKILL or SIGTERM with no user space signal handler.

A terminal end state is important because it usually means you’re working towards decreasing the complexity of the stack and code currently being executed. Other desired states often result in the complexity actually becoming higher and harder to reason about as concurrency and code flow become more muddled.

Dangerous Default Behavior

You may notice that we didn’t mention some other signals that also terminate by default. Here’s a list of all of the standard signals that terminate by default (excluding core dump signals like SIGABRT or SIGSEGV, since they’re all sensible):

  • SIGALRM
  • SIGEMT
  • SIGHUP
  • SIGINT
  • SIGIO
  • SIGKILL
  • SIGLOST
  • SIGPIPE
  • SIGPOLL
  • SIGPROF
  • SIGPWR
  • SIGSTKFLT
  • SIGTERM
  • SIGUSR1
  • SIGUSR2
  • SIGVTALRM

At first glance, these may seem reasonable, but here are a few outliers:

  • SIGHUP: If this was used only as it was originally intended, defaulting to terminate would be sensible. With the current mixed usage meaning “reopen files,” this is dangerous.
  • SIGPOLL and SIGPROF: These are in the bucket of “these should be handled internally by some standard function rather than your program.” However, while probably harmless, the default behavior to terminate still seems nonideal.
  • SIGUSR1 and SIGUSR2: These are "user-defined signals" that you can ostensibly use however you like. But because these are terminal by default, if you implement USR1 for some specific need and later no longer need it, you can't just safely remove the code. You have to consciously think to explicitly ignore the signal, and that's not going to be obvious even to an experienced programmer.

So that's almost one-third of terminal signals, which are at best questionable and, at worst, actively dangerous as a program's needs change. Worse still, even the supposedly "user-defined" signals are a disaster waiting to happen when someone forgets to explicitly ignore them with SIG_IGN. Even an innocuous SIGUSR1 or SIGPOLL may cause incidents.

This is not simply a question of familiarity. No matter how well you know how signals work, it’s still extremely hard to write signal-correct code the first time around because, despite their appearance, signals are far more complex than they seem.

Code Flow, Concurrency, and the Myth of SA_RESTART

Programmers generally do not spend their entire day thinking about the inner workings of signals. This means that when it comes to actually implementing signal handling, they often subtly do the wrong thing.

I'm not even talking about the "trivial" cases, like safety in a signal handling function, which is mostly solved by only bumping a sig_atomic_t, or using C++'s atomic signal fence machinery. No, that's mostly easily searchable and memorable as a pitfall by anyone after their first time through signal hell.

What's a lot harder is reasoning about the code flow of the nominal portions of a complex program when it receives a signal. Doing so requires either constantly and explicitly thinking about signals at every part of the application life cycle (hey, what about EINTR, is SA_RESTART enough here? What flow should we go into if this terminates prematurely? I now have a concurrent program, what are the implications of that?), or setting up a sigprocmask or pthread_sigmask for some part of your application life cycle and praying that the code flow never changes (which is certainly not a good bet in an atmosphere of fast-paced development). signalfd or running sigwaitinfo in a dedicated thread can help somewhat here, but both of these have enough edge cases and usability concerns to make them hard to recommend.
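For illustration, here is roughly what the signalfd variant looks like: the signal is blocked and then consumed as ordinary data from a file descriptor, in the normal flow of the program (you could trigger it with kill -HUP from another shell). This is a hedged sketch, and the usability caveats above still apply:

#include <signal.h>
#include <stdio.h>
#include <sys/signalfd.h>
#include <unistd.h>

int main(void) {
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGHUP);

  /* The signal must be blocked, or it may still be delivered
   * asynchronously instead of being queued to the fd. */
  if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
    return 1;

  int sfd = signalfd(-1, &mask, SFD_CLOEXEC);
  if (sfd == -1)
    return 1;

  /* The signal now arrives as an ordinary read, which could also be
   * multiplexed with poll or epoll. */
  struct signalfd_siginfo si;
  if (read(sfd, &si, sizeof(si)) == sizeof(si) && si.ssi_signo == SIGHUP)
    puts("got SIGHUP; reload configuration here");

  close(sfd);
  return 0;
}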

Most experienced programmers know by now that writing thread-safe code correctly is very hard. Well, if you thought thread-safe code was hard, signals are significantly harder. Signal handlers must rely only on strictly lock-free code with atomic data structures: lock-free because the main flow of execution is suspended and we don't know what locks it's holding, and atomic because the main flow of execution could be midway through non-atomic operations. Handlers must also be fully reentrant, that is, able to nest within themselves, since signal handlers can overlap if a signal is sent multiple times (or even with one signal, with SA_NODEFER).

That's one of the reasons why you can't use functions like printf or malloc in a signal handler: they rely on global mutexes for synchronization. If you were holding that lock when the signal was received and then called a function requiring that lock again, your application would deadlock. This is really, really hard to reason about, and that's why many people simply write something like the following as their signal handling:

#include <signal.h>

static volatile sig_atomic_t received_sighup;

static void sighup(int sig __attribute__((unused))) { received_sighup = 1; }

static int configure_signal_handlers(void) {
  return sigaction(
      SIGHUP,
      &(const struct sigaction){.sa_handler = sighup, .sa_flags = SA_RESTART},
      NULL);
}

int main(int argc, char *argv[]) {
  if (configure_signal_handlers()) {
    /* failed to set handlers */
  }

  /* usual program flow */

  if (received_sighup) {
    /* reload */
    received_sighup = 0;
  }

  /* usual program flow */
}

The problem is that, while this approach, signalfd, or other attempts at asynchronous signal handling might look fairly simple and robust, they ignore the fact that the point of interruption is just as important as the actions performed after receiving the signal. For example, suppose your user space code is doing I/O or changing the metadata of objects that come from the kernel (like inodes or FDs). In this case, you're probably actually in a kernel space stack at the time of interruption. For example, here's how a thread might look when it's trying to close a file descriptor:

# cat /proc/2965230/stack
[<0>] schedule+0x43/0xd0
[<0>] io_schedule+0x12/0x40
[<0>] wait_on_page_bit+0x139/0x230
[<0>] filemap_write_and_wait+0x5a/0x90
[<0>] filp_close+0x32/0x70
[<0>] __x64_sys_close+0x1e/0x50
[<0>] do_syscall_64+0x4e/0x140
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Here, __x64_sys_close is the x86_64 variant of the close system call, which closes a file descriptor. At this point in its execution, we’re waiting for the backing storage to be updated (that’s this wait_on_page_bit). Since I/O work is usually several orders of magnitude slower than other operations, schedule here is a way of voluntarily hinting to the kernel’s CPU scheduler that we are about to perform a high-latency operation (like disk or network I/O) and that it should consider finding another process to schedule instead of the current process for now. This is good, as it allows us to signal to the kernel that it is a good idea to go ahead and pick a process that will actually make use of the CPU instead of wasting time on one which can’t continue until it’s finished waiting for a response from something that may take a while.

Imagine that we send a signal to the process we were running. The signal we have sent has a user space handler in the receiving thread, so we'll resume in user space. One of the many ways this race can end up is that the kernel will try to come out of schedule, further unwind the stack, and eventually return an errno of EINTR to user space to indicate that we were interrupted (internally, the kernel represents this with values like ERESTARTSYS). But how far did we get in closing the file? What's the state of the file descriptor now?

Now that we’ve returned to user space, we’ll run the signal handler. When the signal handler exits, we’ll propagate the error to the user space libc’s close wrapper, and then to the application, which, in theory, can do something about the situation encountered. We say “in theory” because it’s really hard to know what to do about many of these situations with signals, and many services in production do not handle the edge cases here very well. That might be fine in some applications where data integrity isn’t that important. However, in production applications that do care about data consistency and integrity, this presents a significant problem: the kernel doesn’t expose any granular way to understand how far it got, what it achieved and didn’t and what we should actually do about the situation. Even worse, if close returns with EINTR, the state of the file descriptor is now unspecified:

“If close() is interrupted by a signal [...] the state of [the file descriptor] is unspecified.”

Good luck trying to reason about how to handle that safely and securely in your application. In general, handling EINTR even for well-behaved syscalls is complicated. There are plenty of subtle issues forming a large part of the reason why SA_RESTART is not enough. Not all system calls are restartable, and expecting every single one of your application’s developers to understand and mitigate the deep nuances of getting a signal for every single syscall at every single call site is asking for outages. From man 7 signal:


“The following interfaces are never restarted after being interrupted by a signal handler, regardless of the use of SA_RESTART; they always fail with the error EINTR [...]”
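To make that burden concrete, below is the classic retry wrapper that each interruptible call site ends up needing, sketched here for read. Note that, per the earlier quote, blindly applying the same loop to close is not safe, because the descriptor's state after EINTR is unspecified:

#include <errno.h>
#include <unistd.h>

/* Retry a read that was interrupted before transferring any data.
 * The rules differ per syscall: close, for example, must NOT be
 * retried like this. */
static ssize_t read_retry(int fd, void *buf, size_t count) {
  ssize_t n;
  do {
    n = read(fd, buf, count);
  } while (n == -1 && errno == EINTR);
  return n;
}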

Likewise, using a sigprocmask and expecting code flow to remain static is asking for trouble as developers do not typically spend their lives thinking about the bounds of signal handling or how to produce or preserve signal-correct code. The same goes for handling signals in a dedicated thread with sigwaitinfo, which can easily end up with GDB and similar tools being unable to debug the process. Subtly wrong code flows or error handling can result in bugs, crashes, difficult to debug corruptions, deadlocks and many more issues that will send you running straight into the warm embrace of your preferred incident management tool.

High Complexity in Multithreaded Environments

If you thought all this talk of concurrency, reentrancy and atomicity was bad enough, throwing multithreading into the mix makes things even more complicated. This is especially important when considering the fact that many complex applications run separate threads implicitly, for example, as part of jemalloc, GLib, or similar. Some of these libraries even install signal handlers themselves, opening a whole other can of worms.

Overall, man 7 signal has this to say on the matter:

“A signal may be generated (and thus pending) for a process as a whole (e.g., when sent using kill(2)) or for a specific thread [...] If more than one of the threads has the signal unblocked, then the kernel chooses an arbitrary thread to which to deliver the signal.”


More succinctly, “for most signals, the kernel sends the signal to any thread that doesn’t have that signal blocked with sigprocmask“. SIGSEGV, SIGILL and the like resemble traps, and have the signal explicitly directed at the offending thread. However, despite what one might think, most signals cannot be explicitly sent to a single thread in a thread group, even with tgkill or pthread_kill.

This means that you can’t trivially change overall signal handling characteristics as soon as you have a set of threads. If a service needs to do periodic signal blocking with sigprocmask in the main thread, you need to somehow communicate to other threads externally about how they should handle that. Otherwise, the signal may be swallowed by another thread, never to be seen again. Of course, you can block signals in child threads to avoid this, but if they need to do their own signal handling, even for primitive things like waitpid, it will end up making things complex.
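The classic arrangement, sketched below, is to block the relevant signals in the main thread before spawning anything (threads inherit the mask) and to dedicate one thread to sigwait. It works, but it is exactly the kind of cross-thread contract that silently breaks when threading strategies change:

#include <pthread.h>
#include <signal.h>
#include <stdio.h>

static void *signal_thread(void *arg) {
  sigset_t *set = arg;
  for (;;) {
    int sig;
    /* Synchronously wait for a signal that every thread has blocked. */
    if (sigwait(set, &sig) == 0)
      printf("handling signal %d in a known context\n", sig);
  }
  return NULL;
}

int main(void) {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGHUP);
  sigaddset(&set, SIGTERM);

  /* Block before creating any threads so they all inherit the mask and
   * the kernel cannot pick an arbitrary thread for delivery. */
  pthread_sigmask(SIG_BLOCK, &set, NULL);

  pthread_t tid;
  pthread_create(&tid, NULL, signal_thread, &set);

  /* ... spawn workers and run the application ... */
  pthread_join(tid, NULL);
  return 0;
}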

Just as with everything else here, these aren’t technically insurmountable problems. However, one would be negligent in ignoring the fact that the complexity of synchronization required to make this work correctly is burdensome and lays the groundwork for bugs, confusion and worse.

Lack of Definition and Communication of Success or Failure

Signals are propagated asynchronously in the kernel. The kill syscall returns as soon as the pending signal is recorded for the process or thread’s task_struct in question. Thus, there’s no guarantee of timely delivery, even if the signal isn’t blocked.

Even if there is timely delivery of the signal, there’s no way to communicate back to the signal issuer what the status of their request for action is. As such, any meaningful action should not be delivered by signals, since they only implement fire-and-forget with no real mechanism to report the success or failure of delivery and subsequent actions. As we’ve seen above, even seemingly innocuous signals can be dangerous when they are not configured in user space.


Anyone using Linux for long enough has undoubtedly run into a case where they want to kill some process but find that the process is unresponsive even to supposedly always fatal signals like SIGKILL. The problem is that misleadingly, kill(1)’s purpose isn’t to kill processes, but just to queue a request to the kernel (with no indication about when it will be serviced) that someone has requested some action to be taken.

The kill syscall's job is to mark the signal as pending in the kernel's task metadata, which it does successfully even when a SIGKILL'd task doesn't die. In the case of SIGKILL in particular, the kernel guarantees that no more user mode instructions will be executed, but we may still have to execute instructions in kernel mode to complete actions that otherwise might result in data corruption or to release resources. For this reason, kill still succeeds even if the target is in D (uninterruptible sleep) state. kill itself doesn't fail unless you provide an invalid signal, you don't have permission to send that signal, or the pid you are targeting does not exist. That makes it useless for reliably propagating non-terminal states to applications.
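As a small sketch of that narrow contract (the caller supplying the target pid is hypothetical): a zero return means only that the signal is now pending, and the cases below are the only ways the call itself fails:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Returns 0 if the signal was queued; "queued" is all that success
 * means. Whether, when, or in which thread anything happens as a
 * result is never reported back to us. */
static int send_signal(pid_t pid, int sig) {
  if (kill(pid, sig) == -1) {
    switch (errno) {
    case EINVAL:
      fputs("invalid signal number\n", stderr);
      break;
    case EPERM:
      fputs("no permission to signal that pid\n", stderr);
      break;
    case ESRCH:
      fputs("no such process\n", stderr);
      break;
    }
    return -1;
  }
  return 0;
}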

In Conclusion

  • Signals are fine for terminal state changes handled purely in-kernel with no user space handler. For signals that you actually would like to immediately kill your program, leave those signals alone for the kernel to handle. This also means that the kernel may be able to exit early from its work, freeing up your program's resources more quickly, whereas a user space IPC request would have to wait for the user space portion to start executing again.
  • A way to avoid getting into trouble handling signals is to not handle them at all. However, for applications that must do something about cases like SIGTERM, ideally use a high-level API like folly::AsyncSignalHandler, where a number of the warts have already been smoothed over.

  • Avoid communicating application requests with signals. Use self-managed notifications (like inotify) or user space RPC with a dedicated part of the application life cycle to handle it instead of relying on interrupting the application.
  • Where possible, limit the scope of signals to a subsection of your program or threads with sigprocmask, reducing the amount of code that needs to be regularly scrutinized for signal-correctness. Bear in mind that if code flows or threading strategies change, the mask may not have the effect you intended.
  • At daemon start, mask terminal signals that are not uniformly understood and could be repurposed at some point in your program to avoid falling back to kernel default behavior. My suggestion is the following:
signal(SIGHUP, SIG_IGN);
signal(SIGQUIT, SIG_IGN);
signal(SIGUSR1, SIG_IGN);
signal(SIGUSR2, SIG_IGN);

Signal behavior is extremely complicated to reason about even in well-authored programs, and its use presents an unnecessary risk in applications where other alternatives are available. In general, do not use signals for communicating with the user space portion of your program. Instead, either have the program transparently handle events itself (for example, with inotify), or use user space communication that can report back errors to the issuer and is enumerable and demonstrable at compile time, like Thrift, gRPC or similar.
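As one concrete shape of the inotify option, here is a rough sketch that notices configuration changes without any signal at all; the watched path is hypothetical:

#include <limits.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void) {
  int fd = inotify_init1(IN_CLOEXEC);
  if (fd == -1)
    return 1;

  /* Watch the config file itself (hypothetical path). */
  if (inotify_add_watch(fd, "/etc/mydaemon.conf",
                        IN_CLOSE_WRITE | IN_MOVE_SELF) == -1)
    return 1;

  char buf[sizeof(struct inotify_event) + NAME_MAX + 1];
  /* Blocks until the file changes; no stray signal can terminate us,
   * and the event can be multiplexed with the rest of our I/O. */
  if (read(fd, buf, sizeof(buf)) > 0)
    puts("config changed; reloading");

  close(fd);
  return 0;
}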

I hope this article has shown you that signals, while they may ostensibly appear simple, are in reality anything but. The aesthetics of simplicity that promote their use as an API for user space software belie a series of implicit design decisions that do not fit most production use cases in the modern era.

Let’s be clear: there are valid use cases for signals. Signals are fine for basic communication with the kernel about a desired process state when there’s no user space component, for example, that a process should be killed. However, it is difficult to write signal-correct code the first time around when signals are expected to be trapped in user space.

Signals may seem attractive due to their standardization, wide availability and lack of dependencies, but they come with a significant number of pitfalls that only become more concerning as your project grows. Hopefully, this article has provided you with some mitigations and alternative strategies that will allow you to still achieve your goals, but in a safer, less subtly complex and more intuitive way.


To learn more about Meta Open Source, visit our open source site, subscribe to our YouTube channel, or follow us on Twitter, Facebook and LinkedIn.



Now people can share directly to Instagram Reels from some of their favorite apps


More people are creating, sharing and watching Reels than ever before. We've seen the creator community dive deeply into video content and use it to connect with their communities. We're running a limited alpha test that lets creators share video content directly from select integrated apps to Instagram Reels. Now, creators won't be interrupted in their workflow, making it easier for them to share and express themselves on Reels.

“With the shift to video happening across almost all online platforms, our innovative tools and services empower creativity and fuel the creator economy and we are proud to be able to offer a powerful editing tool like Videoleap that allows seamless content creation, while partnering with companies like Meta to make sharing content that much easier.”- Zeev Farbman, CEO and co-founder of Lightricks.

Starting this month, creators can share short videos directly to Instagram Reels from some of their favorite apps, including Videoleap, Reface, Smule, VivaVideo, SNOW, B612, VITA and Zoomerang, with more coming soon. These apps and others also allow direct sharing to Facebook, which is available for any business with a registered Facebook App to use.

We hope to expand this test to more partners in 2023. If you’re interested in being a part of that beta program, please fill out this form and we will keep track of your submission. We do not currently have information to share about general availability of this integration.

Learn more here about sharing Stories and Reels to Facebook and Instagram and start building today.


FAQs

Q: What is the difference between the Instagram Content Publishing API and Instagram Sharing to Reels?


A: Sharing to Reels is different from the Instagram Content Publishing API, which allows Instagram Business accounts to schedule and publish posts to Instagram from third-party platforms. Sharing to Reels is specifically for mobile apps to display a ‘Share to Reels’ widget. The target audience for the Share to Reels widget is consumers, whereas the Content Publishing API is targeted towards businesses, including third-party publishing platforms such as Hootsuite and Sprout Social that consolidate sharing to social media platforms within their third-party app.

Q: Why is Instagram partnering with other apps?

A: Creators already use a variety of apps to create and edit videos before uploading them to Instagram Reels – now we’re making that experience faster and easier. We are currently doing a small test of an integration with mobile apps that creators know and love, with more coming soon.

Q: How can I share my video from another app to Reels on Instagram?


A: How it works (Make sure to update the mobile app you’re using to see the new Share to Reels option):

  • Create and edit your video in one of our partner apps
  • Once your video is ready, tap share and then tap the Instagram Reels icon
  • You will enter the Instagram Camera, where you can customize your reel with audio, effects, Voiceover and stickers. Record any additional clips or swipe up to add an additional clip from your camera roll.
  • Tap ‘Next’ to add a caption, hashtag, location, tag others or use the paid partnerships label.
  • Tap ‘Share’. Your reel will be visible where you share reels today, depending on your privacy settings.

Q: How were partners selected?

A: We are currently working with a small group of developers that focus on video creation and editing as early partners. We'll continue to expand to apps with other types of creation experiences.

Q: When will other developers be able to access Sharing to Reels on Instagram?

A: We do not currently have a date for general availability, but are planning to expand further in 2023.

Q: Can you share to Facebook Reels from other apps?


A: Yes, Facebook offers the ability for developers to integrate with Sharing to Reels. For more information on third-party sharing opportunities, check out our entire suite of sharing offerings.



What to know about Presto SQL query engine and PrestoCon


The open source Presto SQL query engine is used by a diverse set of companies to navigate increasingly large data workflows. These companies are using Presto in support of e-commerce, cloud, security and other areas. Not only do many companies use Presto, but individuals from those companies are also active contributors to the Presto open source community.

In support of that community, Presto holds meetups around the world and has an annual conference, PrestoCon, where experts and contributors gather to exchange knowledge. This year’s PrestoCon, hosted by the Linux Foundation, takes place December 7-8 in Mountain View, CA. This blog post will explore some foundational elements of Presto and what to expect at this year’s PrestoCon.

What is Presto?

Presto is a distributed SQL query engine for data platform teams. Presto users can perform interactive queries on data where it lives using ANSI SQL across federated and diverse sources. Query engines allow data scientists and analysts to focus on building dashboards and utilizing BI tools so that data engineers can focus on storage and management, all while communicating through a unified connection layer.

In short, the scientist does not have to consider how or where data is stored, and the engineer does not have to optimize for every use case for the data sources they manage. You can learn more about Presto in a recent ELI5 video.



Presto was developed to solve the problem of petabyte-scale, multi-source data queries taking hours or days to return. Those resource and time constraints made real-time analysis impossible. Presto can return results from those same queries in less than a second in most cases, allowing for interactive data exploration.


Not only is it highly scalable, but it’s also extensible, allowing you to build your own connector for any data source Presto does not already support. At a low level, Presto also supports a wide range of file types for query processing. Presto was open sourced by Meta and later donated to the Linux Foundation in September of 2019.


What is PrestoCon?

PrestoCon is held annually in the Bay Area and hosted by the Linux Foundation. This year, the event takes place December 7-8 at the Computer History Museum. You can register here. Each year at PrestoCon, you can hear about the latest major evolutions of the platform, how different organizations use Presto and what plans the Technical Steering Committee has for Presto in the coming year.

Presto's scalability is especially apparent at PrestoCon: every year we hear from small startups as well as industry leaders like Meta and Uber, all using the platform for use cases both small and large. If you're looking to contribute to open source, PrestoCon is a great opportunity for networking, as well as for hearing the vision that the Technical Steering Committee has for the project in the coming year.



Where is Presto used?

Since its release in November of 2013, Presto has been used as an integral part of big data pipelines within Meta and other massive-scale companies, including Uber and Twitter.

The most common use case is connecting business intelligence tools to vast data sets within an organization. This enables crucial questions to be answered faster and makes data-driven decision-making more efficient.

How does Presto work?

First, a coordinator takes your statement and parses it into a query. The internal planner generates an optimized plan as a series of stages, which are further separated into tasks. Tasks are then assigned to workers to process in parallel.

Workers then use the relevant connector to pull data from the source.


Workers return the output of each task until the stage is complete. The final worker then passes the stage's output on to the next stage, where another series of tasks is executed.

The results of the stages are combined, eventually producing the final result of the original statement at the coordinator, which then returns it to the client.

How do I get involved?

To start using Presto, go to prestodb.io and click Get Started.

We would love for you to join the Presto Slack channel if you have any questions or need help. Visit the community page on the Presto website to see all the ways you can get involved and find other users and developers interested in Presto.

If you would like to contribute, go to the GitHub repository and read over the Contributors’ Guide.


Where can I learn more?

To learn more about Presto, check out its website for installation guides, user guides, conference talks and samples.

Make sure you check out previous Presto talks, and attend the annual PrestoCon event if you are able to do so.

To learn more about Meta Open Source, visit our open source site, subscribe to our YouTube channel, or follow us on Twitter, Facebook and LinkedIn.



How to Interpret Webhook Components in the WhatsApp Business Platform


The ways customers want to connect are changing. The WhatsApp Business Platform gives businesses an integrated way to communicate with customers right where they are. In order to integrate properly when using the Cloud API, hosted by Meta, you’ll need to leverage webhooks so applications have a way to respond to events. Webhooks allow your application to monitor three primary events on WhatsApp so you can react with different functionality depending on your goals.

This article looks at these three components, goes through the information they carry, and provides some use-case scenarios to give you an idea of the possibilities.

Interpreting Different Webhook Components

To send and receive messages on WhatsApp, it’s critical to keep track of statuses and errors to help ensure you’re communicating effectively with your customers, which you can do with webhooks.

With webhooks, the WhatsApp Business Platform monitors events and sends notifications when one occurs. Each event involves one of three components: messages, statuses, and errors.

Let’s explore each of these and examine examples of how you can use them.


Messages

The messages component is the largest of the three event types and contains two core objects:

  • Contacts — which contain information about the message’s sender.

  • Messages — which provide information about a message’s type and contents.

These two objects allow your application to manage and respond to the people who interact with it. The contacts object contains two pieces of information: the sender's name and their WhatsApp ID. The name lets your application address the contact without further lookups, while the WhatsApp ID lets you keep track of contacts or use the contacts/ endpoint to add additional functionality.

For instance, you can verify the customer and start the opt-in process within the customer-initiated conversation, which allows you to message them outside the initial 24-hour response window. It’s important to note that only the text, contacts, and location message types provide contact information.

The message object is where the bulk of the information is stored, including the message contents, type of message, and other relevant information. Depending on the message type, the actual payload of the message component can vary widely, so it's crucial to determine the message type before interpreting the rest of the payload (see the sketch after this list). Message types include:

  • Text: a standard text-only message

  • Contact: contains a user’s full contact details

  • Location: address, latitude, and longitude

  • Unknown: unsupported messages from users, which usually contain errors.

  • Ephemeral: disappearing messages

  • Media message types: contain information for the specified media file. These types include:

    • Document

    • Image

    • Audio

    • Video

    • Voice
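Because the payload shape depends on the type, a receiving application typically branches on the type field before touching anything else. Below is a rough sketch in C using the cJSON library (the library choice is mine, not something the platform prescribes); the field names follow Meta's published webhook examples, but verify them against the current reference:

#include <stdio.h>
#include <string.h>
#include <cjson/cJSON.h>

/* Dispatch on the "type" field of one element of the webhook's
 * "messages" array (i.e., one parsed message object). */
static void handle_message(const cJSON *msg) {
  const cJSON *type = cJSON_GetObjectItemCaseSensitive(msg, "type");
  if (!cJSON_IsString(type))
    return;

  if (strcmp(type->valuestring, "text") == 0) {
    const cJSON *text = cJSON_GetObjectItemCaseSensitive(msg, "text");
    const cJSON *body = cJSON_GetObjectItemCaseSensitive(text, "body");
    if (cJSON_IsString(body))
      printf("text message: %s\n", body->valuestring);
  } else if (strcmp(type->valuestring, "location") == 0) {
    /* location payloads carry latitude/longitude fields instead */
    puts("location message");
  } else {
    printf("unhandled type: %s\n", type->valuestring);
  }
}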

These different data types can have very different uses, from reviewing images and screenshots from concerned customers to collecting information about where to ship goods and send services. To use these different data types most effectively, you can create applications to handle different forms of communication, with functionalities such as:

  • Ask your customers to provide a shipping or mailing address. You can use the location-based message feature to capture your users’ location to determine where to send their goods and services.

  • Show customers products and communicate product details through a message. You can use the referred_product field within messages to offer your users specific product details. Using this field develops a more personal, conversational shopping experience and customer interactions.

  • Build support functionality that allows customers to take and send images and videos of product concerns, and submit those for a support case. Once the user has submitted a support ticket, the app can track the case — including steps taken towards resolution and conversations between support teams and the customer through WhatsApp — using a unique case identifier.

These are just some potential features you can build using the interactivity provided by webhooks and the message object. These features extend your current communication channels and provide additional options for customers.

Statuses

Where the messages component provides your application with insight into events that originate directly from your customers, the statuses component keeps track of the results of messages you send and the conversation history. There are six status components:

  • Sent: the application sent your message, and it is in transit.
  • Delivered: the user’s device successfully received the message.
  • Read: the user has read your message.
  • Deleted: a user deleted a message that you sent.
  • Warning: a message sent by your application contains an item that isn’t available or doesn’t exist.
  • Failed: a message sent by your application failed to arrive.

Status components also contain information on the recipient ID, the conversation, and the pricing related to the current conversation. Conversations on WhatsApp are a grouping of messages within a 24-hour window that are either user-initiated or business-initiated. Keeping track of these conversations is vital, as a new conversation occurs when you send additional responses after the 24-hour period ends.

Some functionality you may want to add to your application based on status events includes:

  • Ensure that your application's generated messages were sent, that they arrived, and whether the recipient read them by using a combination of these status types and timestamps within the status object. This information allows your application to follow up with customers if they didn't engage.
  • Keep analytical information about your application's messages, especially regarding business-initiated conversations. For example, if your application uses a WhatsApp customer contact list to send offer messages, the status component helps you understand how many were sent, delivered, read, responded to, or failed, letting you measure your campaign's success.

Errors

Finally, the errors component allows your application to receive any out-of-band errors within WhatsApp that affect your integration. These errors don't stop your application from compiling or running, but they typically indicate that your application is misusing specific functionality. The following are some typical errors.

Error Code 368, Temporarily Blocked for Policy Violations

If your application violates WhatsApp Business Messaging or Commerce policy, your account may be temporarily banned. You can monitor this and pause your application while troubleshooting.

Error 506, Duplicate Post

If your workflows unintentionally generate duplicate messages, you can monitor this to find the source.


Error 131043, Message Expired

Sometimes, messages are not sent during their time to live (TTL) duration. Use this code to know which messages to schedule for resending if needed.

Error handling is a broad, complex subject, and there are many other use cases for which you should be implementing error handling. The errors component helps extend your error handling on the WhatsApp Business Platform for greater consistency.

Conclusion

This article took a high-level look at messages, statuses, and errors returned by webhooks and explored ways you can use these three components to expand your application’s functionality.

Messages provide information on customer interactions, statuses give insight into messages your app sends, and error notices enable you to increase your application’s resilience. Webhooks are critical to ensuring your app interacts with customers seamlessly.

The WhatsApp Business Platform’s webhooks provide your applications with real-time data, enabling you to build better experiences as you interact with customers. Ready to know more? Dive deeper into everything the WhatsApp Business Platform has to offer.


