What's the best way of structuring data on Firebase?
I am new to Firebase and I want to know the best way to structure data on it.
I have a simple example:
My project has Applicants and Applications. One applicant can have several applications. How can I relate these two objects in Firebase? Does it work like a relational database, or does the data design need a completely different approach?
UPDATE: There is now a doc on structuring data. Also, see this excellent post on NoSQL data structures.
The main issue with hierarchical data, as opposed to RDBMS, is that it's tempting to nest data because we can. Generally, you want to normalize data to some extent (just as you would do with SQL) despite the lack of join statements and queries.
You also want to denormalize in places where read efficiency is a concern. This is a technique used by all the large scale apps (e.g. Twitter and Facebook) and although it goes against our DRY principles, it's generally a necessary feature of scalable apps.
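Applied to the applicants and applications from the question, normalizing plus a little denormalizing might look like the rough sketch below. The paths (`applications`, `applicant_applications`), the field names, and the `buildApplicationWrite` helper are all assumptions made for illustration, not something Firebase prescribes. The helper only builds a plain object of path-to-value pairs, so it runs without a database.

```javascript
// Hypothetical helper: returns the writes needed to add one application.
// Every path and field name here is an assumption chosen for illustration.
function buildApplicationWrite(applicantId, applicationId, application) {
  const updates = {};
  // Normalized record: the application is stored once, under its own ID.
  updates[`applications/${applicationId}`] = { ...application, applicantId };
  // Denormalized index: lets a client list one applicant's applications
  // without downloading every application in the system.
  updates[`applicant_applications/${applicantId}/${applicationId}`] = true;
  return updates;
}

const updates = buildApplicationWrite('applicant1', 'app42', { position: 'Engineer' });
console.log(updates);
```

With the real SDK, an object shaped like this could be passed to `firebase.database().ref().update(updates)`, which writes both locations in one call.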
The gist here is that you want to work hard on writes to make reads easy. Keep logical components that are read separately in separate places (e.g. for chat rooms, don't put the messages, the meta info about the rooms, and the lists of members all in the same place if you want to be able to iterate the rooms later).
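To make the chat room advice concrete, here is a minimal sketch that uses a plain object as a stand-in for the database tree. The branch names (`room_meta`, `room_members`, `room_messages`) and the `listRoomNames` helper are hypothetical; the only point is that listing rooms never has to touch the messages.

```javascript
// In-memory stand-in for the database tree. The branch names are
// assumptions chosen for this sketch, not a required layout.
const db = {
  room_meta: {
    room1: { name: 'General' },
    room2: { name: 'Random' },
  },
  room_members: {
    room1: { alice: true, bob: true },
    room2: { alice: true },
  },
  room_messages: {
    room1: { m1: { from: 'alice', text: 'hi' }, m2: { from: 'bob', text: 'hello' } },
    room2: { m3: { from: 'alice', text: 'anyone here?' } },
  },
};

// Listing rooms reads only the small room_meta branch; no messages
// or member lists are involved.
function listRoomNames(tree) {
  return Object.values(tree.room_meta).map(meta => meta.name);
}

const roomNames = listRoomNames(db);
console.log(roomNames);
```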
The primary difference between Firebase's real-time data and a SQL environment is querying data. There's no simple way to say "SELECT USERS WHERE X = Y", because of the real-time nature of the data (it's constantly changing, sharding, reconciling, etc., which requires a simpler internal model to keep the synchronized clients in check).
A simple example will probably set you in the right state of mind, so here goes:
/users/uid
/users/uid/email
/users/uid/messages
/users/uid/widgets
Now, since we're in a hierarchical structure, if I want to iterate users' email addresses, I do something like this:
// I could also use on('child_added') here to great success
// but this is simpler for an example
firebaseRef.child('users').once('value')
    .then(userPathSnapshot => {
        userPathSnapshot.forEach(
            userSnap => console.log('email', userSnap.val().email)
        );
    })
    .catch(e => console.error(e));
The problem with this approach is that I have just forced the client to download all of the users' messages and widgets too. No biggie if none of those things number in the thousands. But it's a big deal for 10k users with upwards of 5k messages each.
So now the optimal strategy for a hierarchical, real-time structure becomes more obvious:
/user_meta/uid/email
/messages/uid/...
/widgets/uid/...
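As a rough illustration of the difference, the sketch below moves a tiny nested tree into the split layout using plain objects. The sample data and the `splitUsers` helper are made up for this example; in a real app the split would normally be made when the data is written rather than in a later pass.

```javascript
// Nested layout from the first example: everything lives under /users/uid.
// The sample values are invented for illustration.
const nested = {
  users: {
    uid1: { email: 'a@gmail.com', messages: { m1: 'hello' }, widgets: { w1: 'clock' } },
    uid2: { email: 'b@example.com', messages: { m2: 'hi' }, widgets: { w2: 'weather' } },
  },
};

// Hypothetical helper: copies each user's data into three separate branches.
function splitUsers(tree) {
  const out = { user_meta: {}, messages: {}, widgets: {} };
  for (const [uid, user] of Object.entries(tree.users)) {
    out.user_meta[uid] = { email: user.email };
    out.messages[uid] = user.messages || {};
    out.widgets[uid] = user.widgets || {};
  }
  return out;
}

const split = splitUsers(nested);
// Iterating email addresses now reads only user_meta, not messages or widgets.
const emails = Object.values(split.user_meta).map(meta => meta.email);
console.log(emails);
```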
An additional tool which is extremely useful in this environment is an index. By creating an index of users with certain attributes, I can quickly simulate a SQL query by simply iterating the index:
/users_with_gmail_accounts/uid/email
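How such an index gets populated is up to the app. One possible sketch, assuming a `user_meta` branch like the one above and a hypothetical `buildGmailIndex` helper, derives the index entries from the users' email addresses. In practice I would expect the index entry to be written at the same time as the user record rather than rebuilt in bulk.

```javascript
// Hypothetical helper: derives index entries from a user_meta branch.
// Matching on the '@gmail.com' suffix is an assumption for this example.
function buildGmailIndex(userMeta) {
  const index = {};
  for (const [uid, meta] of Object.entries(userMeta)) {
    const email = typeof meta.email === 'string' ? meta.email : '';
    if (email.toLowerCase().endsWith('@gmail.com')) {
      index[uid] = { email };
    }
  }
  return index;
}

// Invented sample data standing in for /user_meta.
const userMeta = {
  uid1: { email: 'a@gmail.com' },
  uid2: { email: 'b@example.com' },
  uid3: { email: 'C@Gmail.com' },
};

const gmailIndex = buildGmailIndex(userMeta);
console.log(Object.keys(gmailIndex));
```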
Now if I want to, say, get messages for Gmail users, I can do something like this:
var ref = firebase.database().ref('users_with_gmail_accounts');
ref.once('value').then(idx_snap => {
    idx_snap.forEach(idx_entry => {
        // the snapshot's key is a property here, not the older name() method
        let msg = idx_entry.key + ' has a new message!';
        firebase.database().ref('messages').child(idx_entry.key)
            .on(
                'child_added',
                ss => console.log(msg, ss.key)
            );
    });
})
.catch(e => console.error(e));
I offered some details in another SO post about denormalizing data, so check those out as well. I see that Frank already posted Anant's article, so I won't reiterate that here, but it's also a great read.