Managing Multiple Extended Reference and Outlier Patterns

I am trying to conceptualize how to build an application where users can follow other users and have followers of their own. I know this topic has been covered in some other threads, but what I am really concerned about is determining how to best manage multiple extended references across multiple collections.

For example, when building my application I took into account “what questions am I trying to answer?” to build my collections for effective queries. Since the application will allow users to click a button to see all their followers (just name and profile picture), I thought an extended reference pattern would suffice, only duplicating data that rarely changes. I also wanted to account for any outlier patterns (say if a user has 100 or more followers) to make sure I am keeping my document size down on the rare chance there is an outlier. Example model below…

(Also, a silly side note: I am no longer using Mongoose, in case someone recommends that I don’t.)

User Model

    const mongoose = require('mongoose');

    const userSchema = mongoose.Schema(
        {
            // ... Other Data ...
            followers: [
              {
                userId: { type: mongoose.Schema.Types.ObjectId, required: true, ref: 'User'},
                userName: { type: String, required: true },
                followMeDate: { type: Date, required: true },
                userProfilePicture: { type: String, required: true}
              }
            ],
            countFollowers: { type: Number, required: true, default: 0 },
            hasOutlierFollowers: { type: Boolean, required: true, default: false },
            following: [
              {
                userId: { type: mongoose.Schema.Types.ObjectId, required: true, ref: 'User'},
                userName: { type: String, required: true },
                followingDate: { type: Date, required: true },
                userProfilePicture: { type: String, required: true}
              }
            ],
            countFollowing: { type: Number, required: true, default: 0 },
            hasOutlierFollowing: { type: Boolean, required: true, default: false },
        }
    )

In the model described above, I am able to accomplish the following:

  1. Track the total followers and following when they change with the ‘count’ fields instead of doing some aggregation every time the field is pulled.

  2. Limit the number of queries to 1 in order to see what users you are following or who is following you. The UI would display the user’s name and profile picture and can be sorted by most recent followers. Although this is duplicating data, it is duplicating data that will rarely change.

  3. Manage document size and performance, since only the first 100 followers / following are pulled initially. In the rare event that someone has more followers or is following more users, a separate outliers collection stores the overflow.
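As a rough illustration of points 1 and 3, here is a minimal sketch of how a new follower could be routed either into the embedded array or into the outlier collection. It only builds the update documents, so it runs without a database. The `EMBED_LIMIT` constant, the `buildFollowOps` helper, and the collection names `users` and `outlierUserFollowers` are assumptions for illustration, not part of the original model. It also trusts a previously read `countFollowers`, so concurrent follows could briefly overshoot the limit.

```javascript
// Hypothetical threshold and collection names; adjust to the real schema.
const EMBED_LIMIT = 100;

// Returns the write operations needed when `follower` starts following `user`.
// `user` is the current user document (only _id and countFollowers are read).
function buildFollowOps(user, follower, now = new Date()) {
  const entry = {
    userId: follower._id,
    userName: follower.userName,
    followMeDate: now,
    userProfilePicture: follower.userProfilePicture,
  };

  if (user.countFollowers < EMBED_LIMIT) {
    // Still room in the embedded array: a single write to the user document.
    return [{
      collection: 'users',
      filter: { _id: user._id },
      update: { $push: { followers: entry }, $inc: { countFollowers: 1 } },
      options: {},
    }];
  }

  // Over the limit: store the entry in the outlier collection,
  // then bump the counter and set the outlier flag on the user document.
  return [
    {
      collection: 'outlierUserFollowers',
      filter: { userId: user._id },
      update: { $push: { followers: entry } },
      options: { upsert: true },
    },
    {
      collection: 'users',
      filter: { _id: user._id },
      update: { $inc: { countFollowers: 1 }, $set: { hasOutlierFollowers: true } },
      options: {},
    },
  ];
}
```

With the Node.js driver, each returned operation could be applied as `db.collection(op.collection).updateOne(op.filter, op.update, op.options)`. The same shape would apply to the `following` side with the field names swapped.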

Followers / Following Models

    const outlierUserFollowersSchema = mongoose.Schema(
        {
          userId: { type: mongoose.Schema.Types.ObjectId, required: true, ref: 'User'},
          followers: [
            {
              userId: { type: mongoose.Schema.Types.ObjectId, required: true, ref: 'User'},
              userName: { type: String, required: true },
              followMeDate: { type: Date, required: true },
              userProfilePicture: { type: String, required: true}
            }
          ],
        }
    )

    const outlierUserFollowingSchema = mongoose.Schema(
        {
          userId: { type: mongoose.Schema.Types.ObjectId, required: true, ref: 'User'},
          following: [
            {
              userId: { type: mongoose.Schema.Types.ObjectId, required: true, ref: 'User'},
              userName: { type: String, required: true },
              followingDate: { type: Date, required: true },
              userProfilePicture: { type: String, required: true}
            }
          ],
        }
    )

So my question is, is this okay? If a user changes their name or picture, I would need to update several collections. What if I had even more collections that had extended references of the user’s name and picture?

Or, if this is really a concern of mine, should I increase the number of users within the main user collection array of following / followers to limit the extra queries?

What if I can’t hold more user data due to size constraints (doubtful, but just curious)?

Hi @Jason_Tulloch ,

The extended reference pattern with outliers and counts sounds like a good approach that I would take.

If a user changes their profile picture or username (which should be very rare), you can index the userId in the followers and following arrays. That way you can update all of the duplicated follower data asynchronously in a side process, and the copies become eventually consistent.

I believe that if an image in a following list is a few moments out of date, users will not notice.
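The asynchronous update described above could be sketched roughly as follows. This is only an assumption about how it might look with the Node.js driver: `buildPropagationOp`, `propagateProfileChange`, and the `COPIES` list of collection names are hypothetical. The `arrayFilters` option and the `$[identifier]` positional operator are standard MongoDB features for updating matching array elements.

```javascript
// Builds updateMany arguments that rewrite one user's duplicated fields
// inside an embedded array (e.g. 'followers' or 'following').
function buildPropagationOp(arrayField, userId, changes) {
  const set = {};
  for (const [key, value] of Object.entries(changes)) {
    // $[entry] targets only the array elements matched by arrayFilters.
    set[`${arrayField}.$[entry].${key}`] = value;
  }
  return {
    // This filter is the part that benefits from an index on '<array>.userId'.
    filter: { [`${arrayField}.userId`]: userId },
    update: { $set: set },
    options: { arrayFilters: [{ 'entry.userId': userId }] },
  };
}

// Hypothetical list of (collection, array field) pairs that hold copies.
const COPIES = [
  ['users', 'followers'],
  ['users', 'following'],
  ['outlierUserFollowers', 'followers'],
  ['outlierUserFollowing', 'following'],
];

// Intended for a background job; `db` is assumed to be a driver Db handle.
async function propagateProfileChange(db, userId, changes) {
  for (const [collection, arrayField] of COPIES) {
    const op = buildPropagationOp(arrayField, userId, changes);
    await db.collection(collection).updateMany(op.filter, op.update, op.options);
  }
}
```

The supporting indexes would be multikey indexes such as `db.collection('users').createIndex({ 'followers.userId': 1 })`, one per array that stores a copy. Any additional collection that duplicates the name or picture would need its own entry in the list.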

Another interesting approach is duplicating a “following image URL”. This URL is fixed for each user, and the most up-to-date picture is overwritten at that location as users change profiles. This way you can keep the user’s image history in the user object, while the reference always points to the same URL with the most recent picture (for example, https://myhosting.com/user123_followingprofile.png).
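A small sketch of the fixed-URL idea, assuming the naming scheme from the example URL above. The `IMAGE_HOST` constant, both helper names, and the `profilePictureHistory` field are hypothetical. The actual upload that overwrites the image at the fixed URL depends on the hosting service and is not shown.

```javascript
// Hypothetical host and naming scheme, modeled on the example URL above.
const IMAGE_HOST = 'https://myhosting.com';

// The URL that gets duplicated into follower / following entries.
// It depends only on the user id, so it never changes for a given user.
function followingProfileUrl(userId) {
  return `${IMAGE_HOST}/${userId}_followingprofile.png`;
}

// Update for the user's own document when a new picture is uploaded.
// `versionedUrl` is where this particular version is archived (assumption).
// No follower / following documents are touched.
function buildPictureChangeUpdate(versionedUrl) {
  return { $push: { profilePictureHistory: versionedUrl } };
}
```

Here a picture change costs one write to the user document plus one image overwrite, regardless of how many documents carry the duplicated URL.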

Of course, adding more references adds complex logic and has a performance impact, so each change should be weighed on its own merits.

Let me know if that makes sense.

Pavel

@Pavel_Duchovny Thanks for taking the time to respond. I was pretty confident in my approach, but I really appreciate the validation that my thought process was heading in the right direction.

Regarding profile images, I agree it may be wise to track user image history. Just to confirm, your suggestion is to duplicate the same image URL across any document that holds the user’s profile picture. Then, whenever a user updates their profile picture, the new image replaces the old one at that URL while the URL itself stays the same, avoiding any additional writes? In addition, a separate array of profile picture URL strings can be maintained with historical versions of the user’s profile picture. That definitely makes a lot of sense to me, if I am understanding correctly.

Hi @Jason_Tulloch ,

Exactly, you got my point. The URL remains the same; only the contents of the image are overwritten.

It avoids lots of heavy updates considering that profile pics do update from time to time.