Congress is considering legislation (“the ACCESS Act”) that mandates interoperability in an effort to stimulate competition in digital markets such as social networking. However, as currently written, the legislation is likely to fail in its objective. The reason is that it ignores one of the crucial forces that have allowed firms such as Meta to remain at the top of the social networking space: the indirect network effects from their rich streams of user-generated data that allow them to curate highly engaging content for their users. Moreover, privacy considerations do not justify the strong restrictions on the use of data by firms that interoperate with dominant platforms. Targeted changes to the language of the bill, articulating what I call a “data symmetry principle” that takes privacy into account, would allow entrant platforms to benefit from the rich types of data generated on dominant platforms. This would put would-be entrants on a more level playing field when it comes to scale advantages due to data.

By Cristian Santesteban[1]

 

I. INTRODUCTION

In a rare bipartisan effort, Congress is considering legislation (“the ACCESS Act”) whose purpose is to stimulate competition in digital markets by facilitating the successful entry of new platforms. Over time, the idea is that interoperable networks would erode the significant market power of dominant incumbent platforms such as Meta’s Facebook.[2] While I applaud this effort, and I think that interoperability in this space could be beneficial for generating greater competition, the problem with both the House and Senate bills as written is that they restrict too severely what entrant platforms can do with the data that is shared with them by the dominant platforms. I focus in this article on the social networking space. (For clarity and simplicity, in much of this article I will speak of Meta’s social network, Facebook, as a proxy for a dominant social networking platform or in the language of the bill, a “covered platform.”)

In particular, Section 4.f.2 of the House version of the ACCESS Act imposes the following strict “data minimization” requirement on any entrant platform seeking to interoperate with a covered platform:

“(2) NON-COMMERCIALIZATION OF DATA ON A COVERED PLATFORM .—A business user [i.e., an entrant platform seeking to interoperate with a covered platform] shall not collect, use, or share the data of a user on a covered platform except for the purposes of safeguarding [the] security of such data or maintaining interoperability of services.” (emphasis mine)[3]

The legislation as written does not explicitly allow new entrants to benefit at all from the rich data generated by Facebook users. While I do not object to the restriction on sharing the data obtained from the covered platform with third parties, I believe that a strong restriction on collecting and using that data would severely impede the ability of new networks to recommend engaging content for their users and establish themselves as viable competitors to Facebook and other dominant platforms.[4]

 

II. INDIRECT NETWORK EFFECTS FROM USER-GENERATED DATA

In understanding this point, it is worth exploring what currently makes successful entry so difficult for an upstart social network. The dominant platform, Facebook, enjoys two key competitive advantages over any entrant. First, it enjoys direct network effects on the user side. It is valuable to be on Facebook because so many other users are on Facebook. This is a familiar story and the one that motivates most discussions of interoperability in social networking. As Section 6.c.1 of the House version of the bill makes clear, it is also the motivating argument for the ACCESS Act. This section specifies the Federal Trade Commission’s (“FTC’s”) mandate when adopting standards to implement this legislation:

“the Commission shall seek to encourage entry by reducing or eliminating the network effects that limit competition with the covered platform…”

The legislation as currently written reduces the proprietary direct network effects enjoyed by covered platforms such as Facebook because consumers would be able to switch from Facebook to a new social network and still maintain their friends on Facebook. New users of any interconnected platform would also be able to make friends with Facebook users. As such, a consumer can benefit from a large network of friends that transcends a particular platform. The legislation would help to spread what previously had been proprietary direct network effects to the whole market (at least to the set of firms in the market interconnected with one another). This should be helpful for new entrants in overcoming the direct network effects currently enjoyed by Facebook. All firms that interoperate with Facebook (and with one another) would benefit from this demand-enhancing force.

However, there is a second, less commonly discussed force at work that allows Facebook to remain a dominant social network: the indirect network effects arising from the learning that occurs from additional user-generated data, or, alternatively, the increasing returns to scale to data in complex AI applications.[5] Even this last term doesn’t entirely do justice to the competitive advantages arising from access to a large, continuous stream of rich user data, since the concept of increasing returns in economics has tended to focus on output. In the context of digital platforms, the notion of increasing returns to data refers more broadly to other dimensions of competition such as increased product differentiation and higher quality and more personalized content. New successful firms such as TikTok are powered almost entirely by recommendation systems running off user-generated data. For such firms, the direct network effects due to many of one’s friends being on the network are minimal. This is increasingly the case also for Instagram and its video feature, Reels.

Facebook’s billions of users generate a tremendous amount of rich data – likes, comments, posts, searches, click-throughs, and less obvious information such as the amount of time users hover over a post or video. This detailed information allows Facebook (through its AI algorithms) to learn about and accurately predict the preferences of its users.[6] Indeed, user engagement metrics such as the ones listed above are used to optimize the news feed, Facebook’s most important feature and what keeps its users addicted and scrolling for more. Large amounts of user data allow Facebook to directly observe what individual users like and how they respond to content in their feeds and elsewhere on its platform. In addition, the rich user data allows Facebook to match users with others who have liked and reacted similarly to the same content in the past. This allows Facebook to predict how these users will react when faced with new content that their matched counterparts already interacted with. In fact, Facebook appears to consider “over 100,000 highly personalized factors when determining what’s shown to a user.”[7] Furthermore, Facebook engages in experimentation on a massive scale in a way only possible with the huge amount of data at its disposal. This allows Facebook to experiment with “what the user sees and interacts with on a page” in a way that is “intended to make the content more compelling for all users and allows for more personalized content for each user.”[8] All of this allows Facebook to generate an engaging feed that keeps users on its site.[9] A small network without a lot of users and their user-generated data cannot and will not be able to do this very well.

As my co-author and I put it in another article, an entrant faces a chicken-and-egg problem: “Without a critical mass of data, potential entrants cannot compete along the critical dimensions to attract users; and without sufficient users, they don’t have the data (and in fact, the data may be ‘in use’ by its incumbent rival).”[10]

This phenomenon is something altogether different from the direct network effects due to one’s friends all being on the same network. Most modern social networks have evolved to be entertainment focused, not depending nearly as much on friend networks. Examples of these types of networks are TikTok, YouTube, Spotify, Reddit, and of course, as I mentioned above, increasingly Instagram and Reels. Even if we allow for friends to remain connected on separate networks, in the short to medium run at least, Facebook will continue to benefit from having an incredibly large network of users who are constantly generating data for Facebook to learn from and improve the quality of what it offers its users. Opening up Facebook’s network without allowing entrant platforms to benefit from the vastness and richness of Facebook’s data is a massive missed opportunity that will likely lead to a disappointing outcome for competition in social networking and will result in a lack of faith in regulatory intervention going forward. Moreover, this could have an adverse effect not just on competition in social networks as we know them now on computer and phone screens, but also in the future when social networking expands to new realms such as the metaverse. A Meta-controlled social media landscape now could lead to an entrenchment of its market power in the future as well.[11]

 

III. RECOMMENDATIONS

For an interoperability intervention to be successful, it must directly target both of these forces: direct network effects and increasing returns to data. As mentioned earlier, the current language of the ACCESS Act addresses only the former and, in fact, explicitly impairs the ability of entrant platforms to exploit the full benefits of being interconnected with the largest social networks in the world.

The language restricting the collection and use of data from the covered platform may have been included in a well-intentioned effort to allay privacy concerns.[12] However, to better balance the benefits to competition (and ultimately to consumers) with privacy considerations, this restriction should be relaxed and complemented by a plan to deal specifically with privacy issues. As written, the legislation would deny entrant platforms the possibility of unleashing more of the power of their AI algorithms and prevent them from offering higher quality services that could have a better chance of pulling users away from dominant platforms.

I recommend relaxing the language of Sec. 4.f.2 to read simply: “A business user shall not share the data of a user on a covered platform…” Further, I suggest that the bills make explicit that the entrant platform should be able to collect and use the data generated by users of the covered platform to improve its algorithms and develop better services for its users, subject to specific limitations arising from a “data symmetry principle” that I describe below.[13] Language along these lines could be as follows:

“A business user will be granted permission to collect and use the data of a user on a covered platform for purposes of learning about and generating content for its own users according to a ‘data symmetry’ principle described in Sec. XYZ below.”[14]

Further, Section 6.c.1 of the House bill should be altered to explicitly reflect the goals of the FTC in implementing this legislation. In particular, the twin goals should be to transform the proprietary nature of the direct (user-based) and indirect (data-based) network effects and render them accessible to all interconnected firms in the market. I suggest revising the text of Section 6.c.1 to:

“the Commission shall seek to encourage entry by reducing or eliminating the proprietary direct (user-based) and indirect (data-based) network effects that limit competition with the covered platform…”

These changes to the text of the legislation (along with an articulation of the data symmetry principle as I show below) would make the interoperability regime better able to target the dual forces that currently allow Facebook to remain the dominant social network in the marketplace. By allowing all users to be connected regardless of platform and the data those users generate to be used by all interconnected platforms (subject to privacy limitations described below), we transform the proprietary forces that made dominant firms like Facebook so formidable into market-level forces that will strengthen not just one firm in this sector, but the social networking space as a whole. This would be a win for consumers who will benefit from greater choice, more innovation, and higher quality offerings.

 

IV. PRIVACY CONSIDERATIONS AND THE DATA SYMMETRY PRINCIPLE

An entrant social network should be able to benefit from the data generated by some, but not all, of the users of a covered platform – this is not supposed to be a free-for-all for interconnecting platforms. In particular, the interoperability regime should follow a “data symmetry” principle. This principle says that any content generated by Facebook users that a given Facebook user could potentially interact with (subject to those users’ privacy restrictions) should be made available for an entrant platform to collect and use if that user were to reside on the entrant platform rather than on Facebook itself.[15]

To clarify, imagine a Facebook user called A. Facebook generates a feed for User A based on that user’s interactions with content generated by other Facebook users (as well as other data Facebook collects). These Facebook users may be friends of User A, friends of friends, or anyone on Facebook, depending on the privacy settings of those users. Now imagine that User A leaves Facebook, switches to an entrant network interoperating with Facebook, and retains or reestablishes all of their friends on Facebook. What user-generated data is the entrant platform able to collect and use from Facebook for purposes of learning about User A?[16] The data symmetry principle would grant an entrant platform the ability to collect and use whatever content from Facebook users User A would have been eligible to interact with, had User A remained a Facebook user.[17] Another way of looking at this is that as long as a user on an interconnecting platform is interacting with content from Facebook users (which should mean that the privacy choices of the Facebook users allow that user to interact with the content), the interconnecting platform should be able to collect and use that user-generated data to learn about its own users’ preferences.[18]

I therefore propose adding the following clause to the legislation that articulates the data symmetry principle that sets bounds on the ability of the entrant platform to collect and use data from a covered platform:

“Section XYZ: Data Symmetry Principle. A business user can collect and use any data generated by users of a covered platform that would be eligible to be shown to a user of a covered platform, if that user were instead to be part of the business user’s platform.”

This would allow a new entrant with few initial users to benefit from much more user-generated data than the data generated simply by its own users. Modifying the bill in this manner could add a multiplier effect on the order of 10-100x to the amount of data that the entrant platform would use to learn and improve its recommendation algorithms. The amplification effect depends on how many friends on covered platforms the entrant platform’s users have.
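As a rough illustration of how the amplification could scale with friend counts, the sketch below uses a deliberately simple model that is not taken from the bill or from any platform's data. It assumes each entrant user contributes one stream of data and each friend on a covered platform contributes `relative_activity` streams; the function name and both parameters are hypothetical.

```python
def data_multiplier(avg_covered_friends: float, relative_activity: float = 1.0) -> float:
    """Ratio of data streams observable under data symmetry to own-user streams alone.

    Assumed model (illustrative only): one stream per entrant user, plus
    `relative_activity` streams for each friend that user has on a covered platform.
    """
    if avg_covered_friends < 0 or relative_activity < 0:
        raise ValueError("inputs must be non-negative")
    return 1.0 + avg_covered_friends * relative_activity


# Under this model, averages of roughly 9 to 99 covered-platform friends per
# entrant user span the 10x to 100x range.
for friends in (9, 50, 99):
    print(friends, data_multiplier(friends))
```

Because entrant users may share friends on the covered platform, this count treats overlapping friends as distinct and is therefore best read as an upper bound.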

To make this more concrete, I describe how an entrant platform could learn from the data generated by Facebook users under the data symmetry principle. Suppose we have three users of social networks: Cristian, Frank, and Fatima. Cristian is a user on an entrant platform that has chosen to interoperate with Facebook; it also competes with Facebook for users and attention. Cristian is directly linked to Frank, a Facebook user, because they have become friends.[19] Another Facebook user, Fatima, is indirectly linked to Cristian because she is friends with Frank but not with Cristian. Based on the data symmetry principle, the entrant platform should be able to collect and use any data that anybody on Facebook generates that a user on the entrant platform could view and respond to.[20] At a minimum, this includes, but should not be limited to, the following cases:

  1. Suppose that Cristian from the entrant platform posts some content. Suppose further that Frank responds to it, and this response appears on Cristian’s feed. Since Frank is Cristian’s friend, the data from Frank’s response should be allowed to be collected and used by the entrant platform to learn about Cristian’s preferences.
  2. Now suppose Frank posts something on Facebook. That post could be shared with all of Frank’s friends on the new platform. The new platform would be able to collect and use the data from Frank’s post to observe how its own users interact with it. Let’s say Cristian likes the post.[21] It would not be very useful for the new platform to just observe that one of its users liked some content. It must be able to observe the actual content that its user liked. That makes it crucial for the new platform to be able to collect and use the data from Frank’s post to be able to decipher what its own user Cristian was responding to.
  3. Further, imagine that Facebook user Fatima comments on a post generated by Frank, and Cristian likes Fatima’s comment. The entrant platform should be able to learn about Cristian’s preferences because Cristian has interacted with content generated by Fatima, who’s only indirectly linked to Cristian as a friend of a friend. The entrant platform should be able to collect and use Fatima’s comment so that it can interpret its own users’ response to it, here Cristian’s like. As in the prior example, if the new platform could not collect and use the data from Fatima, then it would only be able to observe that Cristian liked some content, but not be able to see what content the like was in response to. This would severely impair the new platform’s algorithms from learning much of anything about Cristian’s preferences.
  4. Finally, suppose Facebook user Fatima posts something on Facebook. Her friend Frank responds to it. This content wouldn’t normally appear on Cristian’s feed; however, Cristian could seek it out by going to his friend Frank’s profile. Typically, Fatima can choose in her Facebook setting whether her posts are public or restricted only to friends (or a subset of friends). If Fatima chooses to restrict her posts to be viewed only by friends, then Cristian should not be allowed to interact with this content, and neither should the entrant platform be allowed to collect and use this data. On the other hand, if Fatima chooses to make her posts public, then Cristian could choose to go to Frank’s feed (or Fatima’s) and interact with this post. In this case, Fatima’s content should also be eligible for the new platform to collect and use in order to learn more about Cristian’s preferences.
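The four cases above can be expressed as a single eligibility check. The following is a minimal sketch under simplifying assumptions, not a description of how Facebook or any entrant actually implements visibility: the `Content` class, the two audience settings, and the rule that a reply inherits the visibility of the post it responds to are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

PUBLIC, FRIENDS = "public", "friends"  # hypothetical audience settings


@dataclass(frozen=True)
class Content:
    author: str                          # user who generated the content
    audience: str = FRIENDS              # audience chosen by the author
    parent: Optional["Content"] = None   # post being replied to, if any


def can_view(viewer: str, item: Content, friends: Dict[str, Set[str]]) -> bool:
    """Would `viewer` be eligible to see `item` if they were a covered-platform user?"""
    if item.parent is not None:
        # Assumption: a reply inherits the visibility of the post it responds to.
        return can_view(viewer, item.parent, friends)
    if viewer == item.author or item.audience == PUBLIC:
        return True
    return viewer in friends.get(item.author, set())


def entrant_may_collect(entrant_user: str, item: Content,
                        friends: Dict[str, Set[str]]) -> bool:
    """Data symmetry: collectable only if the entrant's own user could interact with it."""
    return can_view(entrant_user, item, friends)


# Friend graph spanning both platforms: Cristian (entrant), Frank and Fatima (Facebook).
friends = {
    "Cristian": {"Frank"},
    "Frank": {"Cristian", "Fatima"},
    "Fatima": {"Frank"},
}

cristian_post = Content("Cristian")
frank_post = Content("Frank")
case1 = Content("Frank", parent=cristian_post)    # Frank replies to Cristian's post
case2 = frank_post                                # Frank's own post
case3 = Content("Fatima", parent=frank_post)      # Fatima comments on Frank's post
case4_private = Content("Fatima", audience=FRIENDS)
case4_public = Content("Fatima", audience=PUBLIC)

for label, item in [("case 1", case1), ("case 2", case2), ("case 3", case3),
                    ("case 4, friends-only", case4_private),
                    ("case 4, public", case4_public)]:
    print(label, entrant_may_collect("Cristian", item, friends))
```

In this sketch only Fatima's friends-only post is ineligible, since Cristian is not among her friends; every other item is one Cristian could have interacted with had he remained on Facebook.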

I now turn to the implications for privacy of relaxing the restrictions on collection and use of data by interoperating platforms. Legislators are rightly concerned about the possibility that a new entrant interconnecting with a covered platform might violate the privacy of the covered platform’s users. Such a violation could occur if an entrant platform were to share covered platform users’ information with unlicensed third parties or sell it to advertisers. The example of Cambridge Analytica and Facebook is often brought up as the nightmare scenario. However, allowing platforms that interoperate with covered platforms such as Facebook to collect and use the data from Facebook users as I described above would not raise these kinds of privacy concerns, for the following three reasons:

  1. Facebook would not simply be sharing data with any platform. All platforms that wish to interconnect with Facebook would have to be reviewed and licensed by a technical body chosen by the FTC before being able to do so. This should ensure that platforms that intend simply to exploit user data for profit, rather than to provide legitimate services to their users, would not be able to interoperate. For example, this would exclude firms set up solely for the purpose of harvesting user data from interconnecting with covered platforms.
  2. No data from Facebook would be allowed to be monetized by any interconnecting platforms in the form of targeted advertising (but these firms could use that data to optimize their recommendation algorithms to show organic content to their users; that is the key point of this article).[22]
  3. No data from Facebook would be allowed to be shared with third-party firms.

Of course, there exist privacy concerns beyond those involving sharing of data with third parties. As mentioned above, Facebook users can limit the users who can interact with content that they post online. Consistent with the data symmetry principle, whatever restrictions on the use of personal data a user has on their home platform should also apply to any interconnected platform.[23] (This is illustrated in Case 4 above.) If Facebook user Fatima posts content and only shares it with her friends, current Facebook policy would prohibit other Facebook users who are not Fatima’s friends from viewing or responding to Fatima’s post. The data symmetry principle would extend to these privacy restrictions and require an entrant platform to adhere to the privacy preferences of the Facebook users whose data it may obtain. In other words, entrant user Cristian would not be able to see and interact with content posted by Fatima if her settings are such that only friends can view her posts. Correspondingly, the entrant platform should not be able to collect and use Fatima’s posts for purposes of learning about Cristian’s preferences, as the two of them are not friends.

 

V. CONCLUSION

In sum, allowing entrant platforms that interoperate with dominant incumbent platforms to collect and use the data generated by users on those dominant platforms as discussed above will empower them to learn about the preferences of their own users more effectively and thus generate more relevant and engaging content. This would increase the competitive viability of the entering networks and allow them to be stronger competitors to the dominant incumbents. In this manner, the indirect network effects from data now enjoyed only by Facebook and other dominant firms would be spread to all the firms that choose to interoperate with them. This could be transformative in altering the competitive dynamics in the social networking space for the benefit of consumers.


[1] Santesteban is Founder and CEO of RedPeak Economics Consulting and Affiliate Lecturer in the Department of Economics at the University of Washington.

[2] Section 4a of the House bill lays out this general mandate: “(a) In General.—A covered platform [e.g. Facebook] shall maintain a set of transparent, third-party-accessible interfaces (including application programming interfaces) to facilitate and maintain interoperability with a competing business or a potential competing business that complies with the standards issued pursuant to section 6(c).”

[3] The text does mention that one exception to this restriction is the maintenance of interoperability of services. However, that is extremely vague and could be interpreted simply as the passive transmission of information so that two competing networks could interconnect. More detail as to what data can be collected and used by the interconnecting platforms needs to be specified, in part in the bills themselves, and also by the technical committee that will have to implement this bill. This article attempts to provide guidance for both modifying the legislation and assisting the technical committee.

[4] Of course, there cannot be a complete restriction on the use of data from the covered platform even as it is written now or the entrant platform would not be able to show content from Facebook users to its users. Further, some restrictions on the collection and use of data from the covered platform may be warranted, e.g. if the covered platform’s users have made strict choices on how they want their data to be used on the covered platform itself.

[5] See, e.g., Cristian Santesteban & Shayne Longpre, How Big Data Confers Market Power to Big Tech: Leveraging the Perspective of Data Science, 65 ANTITRUST BULLETIN 3, 2 (2020) (“In these cases, the primary competitive dimension is directly contingent upon the scale and quality of data. A rival firm could match or even exceed the incumbent’s product on a number of competitive dimensions (user-interface design, marketing, business strategy, and engineering), but without access to the incumbent’s data or user base, their data-dependent applications will not be competitive.”) See also Fiona Scott Morton et al., Equitable Interoperability: The “Super Tool” of Digital Platform Governance, Policy Discussion Paper No. 4, Digital Regulation Project, Yale Tobin Center for Economic Policy, July 13, 2021 at p. 15 (“Although interoperability can eliminate proprietary direct network effects, there remain indirect network effects even in a social network. For example, the more other users on the platform who are similar, the better the quality of their feeds will be (if the network learns from the behavior of other users and applies those results). If these forces are large, a small network may not be able to match the quality of a large one.”)

[6] From a data science perspective, Facebook’s newsfeed is a type of information filtering system, that is, a system “designed to capture consumer attention through personalization features… These systems broadly describe any application that filters only the most relevant/interesting information to a user, whether that be news, social media posts, movies, restaurants, apps, videos, or other products.” (Santesteban & Longpre, supra note 5, at 15).

[7] Josh Constine, How Facebook News Feed Works, TECHCRUNCH.COM (September 6, 2016), https://techcrunch.com/2016/09/06/ultimate-guide-to-the-news-feed/ (last visited Oct. 5, 2022).

[8] Santesteban & Longpre, supra note 5, at 16.

[9] Facebook has been rightly criticized for explicitly promoting content that leads to user addiction and extreme polarization. See, e.g. Ariel Hsieh et al, Addictive Social Media: Why We Need Regulation and Competition for Digital Platforms, PROMARKET (October 27, 2020), https://www.promarket.org/2020/10/27/addictive-social-media-need-regulation-competition-digital-platforms/ (last visited Oct. 9, 2022). This is in large part an effect of its advertising-driven business model and not a natural consequence of Facebook’s access to a lot of data. Without advertising, Facebook could use its rich data to provide feeds to its users that were in line with its users’ preferences, rather than in line with the goals of its advertisers. So, for example, if a user valued more moderate and reasonable language, a data-rich algorithm, not driven by purely commercial concerns, would generate a feed that contained a lot of moderate and reasonable language. This leads to a better matching of content to user, untainted by the advertising objective. Of course, other competing social networks that might wish to interoperate with Facebook might also rely on an advertising model and, as such, the danger exists that more data might lead to more addictive and polarizing content. However, the goal of interoperability is to generate choice for the consumer in terms of business models, and if enough users value a non-advertising model in social networking, interoperability would allow them to make that choice.

[10] Santesteban & Longpre, supra note 5, at 18.

[11] See, e.g., Cristian Santesteban, How to Prevent Big Tech from Hindering Pathbreaking Innovation in the Metaverse, PROMARKET (March 17, 2022), https://www.promarket.org/2022/03/17/big-tech-innovation-metaverse-competition/ (last visited October 5, 2022).

[12] More cynically, it may have been added by lobbyists for the dominant platforms, which did not want to empower their interconnected rivals with the full force of their data.

[13] The FTC will have to develop a legal or regulatory framework to ensure that all interoperating firms abide by these limitations.

[14] I agree that neither the interconnecting entrant platform nor the covered platform should be able to share with other entities the data it obtains from other platforms, at least not without the user’s consent. A detailed description of how to design an interoperability regime that allows for data sharing with third parties and that takes into account privacy is laid out in a separate piece co-authored with Shayne Longpre. See Cristian Santesteban & Shayne Longpre, Invigorating Competition in Social Networking: An Interoperability Remedy that Addresses Data Network Effects and Privacy, CPI CHRONICLE (June 15, 2021), https://www.competitionpolicyinternational.com/invigorating-competition-in-social-networking-an-interoperability-remedy-that-addresses-data-network-effects-and-privacy-concerns/ (last visited October 5, 2022).

[15] To the extent that a user on a covered platform can restrict their own platform from using that data in any way while at the same time allowing other users on that platform to interact with their data, then the user’s preferences on the use restrictions would apply to the entrant platform as well.

[16] If a user moves to a new platform that interoperates with a covered platform, the user’s profile could in principle be ported over to the new platform, and the new platform would not be starting from scratch in understanding the preferences of the user (unless the user chooses to not have their data ported). This would be a stricter requirement than what I’m calling for here. It would require having some form of universal ID for an individual that would be valid across networks.

[17] My proposal is simply that on a going-forward basis, when a user switches from Facebook to a new entrant that interoperates with Facebook, Facebook should have to share with the new platform any data that the user interacts with that is generated by Facebook users. Further, the data symmetry principle should apply to Facebook as well in the sense that Facebook should be able to collect and use content from User A (now on the entrant platform) that Facebook users interact with.

[18] The data symmetry principle as stated in the text could be thought of as a weak data symmetry principle. There is a question of whether a stronger version should apply. The issue is whether Facebook should also share data about Facebook users “similar” to the user on the interconnecting platform, who are not at all related to that user. It is clear that the behavior of similar users influences what Facebook includes on a user’s feed. Those similar users need not be at all related to the user in question in terms of being friends or having friends in common. In those situations, should Facebook be forced to share data on the behavior of those similar users even if the user on the entrant platform never directly interacts with this content? A strong data symmetry principle would suggest yes, but I’m open to further discussion and research on this topic. More generally, a strong data symmetry principle would state that whatever kinds of data from users on its network a covered platform currently relies on to generate content for a particular user on its platform, that data should be shared with an entrant platform if that particular user were to reside on an entrant platform.

[19] Cristian and Frank could also be directly related if the two belong to the same Facebook group. In any interoperability proposal, a member of an entrant network should be able to join a group created in Facebook. The group’s invitations could extend to users beyond Facebook’s platform boundaries.

[20] As long as a user on an interconnecting platform is viewing content from Facebook users, the interconnecting platform should be able to collect and use that information from Facebook users to learn about its own users’ preferences. This is NOT equivalent to saying that the entrant platform should be able to collect and use the same information that Facebook does for any of its users to learn about their preferences. That would be following a strong data symmetry principle that I do not currently advocate in this piece. See discussion supra note 18.

[21] Even if Cristian doesn’t directly like the post or comment on it, the entrant platform could still learn about Cristian’s preferences. Recall that how long a user lingers over a post is also relevant data that social media platforms collect and learn from.

[22] This is stricter than an earlier proposal of mine with co-author Shayne Longpre that would have allowed monetization as long as i) the Facebook user allowed it on Facebook, and ii) the entrant platform allowed monetization of at least some of its own users’ data. Santesteban & Longpre, supra note 14.

[23] This raises the case of what to do if a user has a profile on two interconnected social networking platforms. Could a user become friends with themselves? What if the user has strict privacy restrictions on one platform and loose ones on the other? One response based on the data symmetry principle is that a network receiving the data shared by the strict (loose) platform must maintain strict (loose) privacy controls on that data.